U.S. patent number 11,001,853 [Application Number 16/917,028] was granted by the patent office on 2021-05-11 for carbon fixation systems in plants and algae. This patent grant is currently assigned to NMC, INC. The grantee listed for this patent is NMC, INC. Invention is credited to Natalia Friedland, Richard Thomas Sayre, Somya S. Subramanian.
United States Patent | 11,001,853 |
Sayre, et al. | May 11, 2021 |
Carbon fixation systems in plants and algae
Abstract
Provided are heterologous nucleic acid constructs, vectors and methods for elevating cyclic electron transfer activity, improving carbon concentration, and enhancing carbon fixation in C3 and C4 plants, and algae, and producing biomass or other products from C3 or C4 plants, and algae, selected from among, for example, starches, oils, fatty acids, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids, as well as transgenic plants produced thereby. These methods and transgenic plants and algae encompass the expression, or overexpression, of various combinations of genes that improve carbon concentrating systems in plants and algae, such as bicarbonate transport proteins, carbonic anhydrase, light driven proton pump, cyclic electron flow regulators, etc.
Inventors: | Sayre; Richard Thomas (Los Alamos, NM), Subramanian; Somya S. (Los Alamos, NM), Friedland; Natalia (Palo Alto, CA) | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Applicant: |
| ||||||||||
Assignee: | NMC, INC. (Los Alamos, NM) | ||||||||||
Family ID: | 53773567 | ||||||||||
Appl. No.: | 16/917,028 | ||||||||||
Filed: | June 30, 2020 |
Prior Publication Data
Document Identifier | Publication Date | |
---|---|---|
US 20200347397 A1 | Nov 5, 2020 | |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number | Issue Date | ||
---|---|---|---|---|---|
16358331 | Mar 19, 2019 | 10696977 | |||
15411854 | Mar 19, 2019 | 10233458 | |||
PCT/US2015/041617 | Jul 22, 2015 | ||||
62027354 | Jul 22, 2014 | ||||
Current U.S. Class: | 1/1 |
Current CPC Class: | C12P 1/00 (20130101); C12N 15/8261 (20130101); C12N 15/8269 (20130101); C12N 1/12 (20130101); C07K 14/415 (20130101); Y02A 40/146 (20180101) |
Current International Class: | C12N 15/82 (20060101); C07K 14/415 (20060101); C12N 1/12 (20060101); C12P 1/00 (20060101) |
References Cited [Referenced By]
U.S. Patent Documents
4554101 | November 1985 | Hopp |
5164316 | November 1992 | McPherson et al. |
5196525 | March 1993 | McPherson et al. |
5322938 | June 1994 | McPherson et al. |
5352605 | October 1994 | Fraley et al. |
5359142 | October 1994 | McPherson et al. |
5424200 | June 1995 | McPherson et al. |
5510474 | April 1996 | Quail et al. |
5589583 | December 1996 | Klee et al. |
5593874 | January 1997 | Brown et al. |
5599686 | February 1997 | Defeo-Jones et al. |
5641876 | June 1997 | McElroy et al. |
5659122 | August 1997 | Austin |
6784340 | August 2004 | Aoyama et al. |
6989265 | January 2006 | Blattner et al. |
7053205 | May 2006 | Verdaguer et al. |
7303906 | December 2007 | Blattner et al. |
8039243 | October 2011 | Blattner |
8043842 | October 2011 | Blattner et al. |
8119365 | February 2012 | Blattner et al. |
8178339 | May 2012 | Campbell et al. |
10233458 | March 2019 | Sayre |
10696977 | June 2020 | Sayre et al. |
10696979 | June 2020 | Christensen |
2011/0256605 | October 2011 | Liphardt et al. |
2012/0219994 | August 2012 | Blattner et al. |
2013/0007916 | January 2013 | Spalding |
2019/0203222 | July 2019 | Sayre et al. |
Foreign Patent Documents
0507698 | Oct 1992 | EP | |||
0633317 | Jan 1995 | EP | |||
1483367 | May 2010 | EP | |||
84/02913 | Aug 1984 | WO | |||
87/007644 | Dec 1987 | WO | |||
95/006742 | Mar 1995 | WO | |||
96/06932 | Mar 1996 | WO | |||
97/48819 | Dec 1997 | WO | |||
2004/053135 | Jun 2004 | WO | |||
07/098042 | Aug 2007 | WO | |||
2012/101118 | Aug 2012 | WO | |||
2012/125737 | Sep 2012 | WO | |||
2017/218959 | Dec 2017 | WO | |||
Other References
Ainley, W. Michael, et al., "Regulatable endogenous production ofcytokinins up to `toxic` levels in transgenic plants and planttissues", Plant Molecular Biology, vol. 22, 1993, 13-23. cited byapplicant .
Allen, Doug K., et al., "Carbon and Nitrogen Provisions Alter theMetabolic Flux in Developing Soybean Embryos", Plant Physiology,vol. 161, 2013, 1458-1475. cited by applicant .
Allen, Doug K., et al., "Comparing Photosynthetic and PhotovoltaicEfficiencies and Recognizing the Potential for Improvement",Phytochemistry, vol. 68, 2007, 2197-2210. cited by applicant .
Allen, Doug K., et al., "Isotope labelling of Rubisco subunitsprovides in vivo information on subcellular biosynthesis andexchange of amino acids between compartments", Plant, Cell andEnvironment, Vo. 35, 2012, 1232-1244. cited by applicant .
Allen, Doug K., et al., "Metabolic flux analysis in plants: copingwith complexity", Plant, Cell and Environment, vol. 32, 2009,1241-1257. cited by applicant .
Allen, Doug K., et al., "Quantification of Peptide m/zDistributions from 13C-Labeled Cultures with High-Resolution MassSpectrometry", Anal. Chem., vol. 86, 2014, 1894-1901. cited byapplicant .
Allen, Doug K., et al., "The role of light in soybean seed fillingmetabolism", The Plant Journal, vol. 58, 2009, 220-234. cited byapplicant .
Alric, Jean , "Cyclic electron flow around photosystem I inunicellular green algae", Photosynth Res, vol. 106, 2010, 47-56.cited by applicant .
Altschul, Stephen F., et al., "Basic Local Alignment Search Tool",J. Mol. Biol., vol. 215, 1990, 403-410. cited by applicant .
Altschul, Stephen F., et al., "Gapped Blast and PSI-BLAST: a newgeneration of protein database search programs", Nucleic AcidsResearch, vol. 25, No. 17, 1997, 3389-3402. cited by applicant.
Amunts, Alexey , et al., "The structure of a plant photosystem Isupercomplex at 3.4A resolution", Nature, vol. 447, 2007, 58-63.cited by applicant .
Arrivault, Stephanie , "Use of reverse-phase liquid chromatography,linked to tandem mass spectrometry, to profile the Calvin cycle andother metabolic intermediates in Arabidopsis rosettes at differentcarbon dioxide concentrations", The Plant Journal, vol. 59, 2009,824-839. cited by applicant .
Avsian-Kretchmer, Orna , et al., "The Salt-Stress SignalTransduction Pathway That Activates the gpx1 Promoter Is Mediatedby Intracellular H2O2, Different from the Pathway Induced byExtracellular H2O2", Plant Physiology, vol. 135, 2004, 1685-1696.cited by applicant .
Baumann, Kim , et al., "The DNA Binding Site of the Dof ProteinNtBBF1 Is Essential for Tissue-Specific and Auxin-RegulatedExpression of the roIB Oncogene in Plants", The Plant Cell, vol.11, 1999, 323-333. cited by applicant .
Beja, Oded , et al., "Bacterial Rhodopsin: Evidence for a New Typeof Phototrophy in the Sea", Science, vol. 289, 2000, 1902-1906.cited by applicant .
Benfey, Philip N., et al., "The CaMV 35S enhancer contains at leasttwo domains which can confer different developmental andtissuespecific expression patterns", The EMBO Journal, vol. 8, No.8, 1989, 2195-2202. cited by applicant .
Bihmidine, Saadia , et al., "Regulation of assimilate import intosink organs: update on molecular drivers of sink strength",Frontiers in Plant Science, vol. 4, Issue 177, 2013, 1-15. cited byapplicant .
Blanco, Nicolas E., et al., "Expression of the Minor Isoform PeaFerredoxin in Tabacco Alters Photosynthetic Electron Partitioningand Enhances Cyclic Eletrcon Flow", Plant Physiology, vol. 161,2013, 866-879. cited by applicant .
Blankenship, Robert E., "Comparing Photosynthetic and PhotovoltaicEfficiencies and Recognizing the Potential for Improvement",Science, vol. 332, 2011, 805-809. cited by applicant .
Blazquez, Miguel A., et al., "Gibberellins Promote Flowering ofArabidopsis by Activating the Leafy Promoter", The Plant Cell, vol.10, 1998, 791-800. cited by applicant .
Blume, Beatrix , et al., "Expression of ACC oxidase promoter-GUSfusions in tomato and Nicotiana plumbaginifolia regulated bydevelopmental and environmental stimuli", The Plant Journal, vol.12, No. 4, 1997, 731-746. cited by applicant .
Breyton, Cecile , et al., "Redox Modulation of Cyclic Electron Flowaround Photosystem I in C3 Plants", Biochemistry, vol. 45, 2006,13465-13475. cited by applicant .
Buchel, Annemarie , et al., "Mutation of GT-1 binding sites in thePr-1A promoter influences the level of inducible gene expression invivo", Plant Molecular Biology, vol. 40, No. 3, 1999, 387-396.cited by applicant .
Busch, Karin B., et al., "Dynamics of bioenergeticmicrocompartments", Biol. Chem., vol. 394, No. 2, 2013, 163-188.cited by applicant .
Busk, Peter Kamp, et al., "Regulatory elements in vivo in thepromoter of the abscisic acid responsive gene rab17 from maize",The Plant Journal, vol. 11, No. 6, 1997, 1285-1295. cited byapplicant .
Callis, Judy , et al., "Introns increase gene expression incultured maize cells", Genes & Development, vol. 1, 1987,1183-1200. cited by applicant .
Cao, Yi , et al., "Relationship of Proton Release at theExtracellular Surface to Deprotonation of the Schiff Base in theBacteriorhodopsin Photocycle", Biophysical Journal, vol. 68, 1995,1518-1530. cited by applicant .
Cardol, Pierre , et al., "Regulation of electron transport inmicroalgae", Biochimica et Biophysica Acta, vol. 1807, 2011,912-918. cited by applicant .
Cardon, Guillermo H., et al., "Functional analyis of theArabidopsis thaliana SBP-box gene SPL3: a novel gene involved inthe floral transition", The Plant Journal, vol. 12, No. 2, 1997,367-377. cited by applicant .
Carrillo, Humberto , et al., "The Multiple Sequence AlignmentProblem in Biology", SIAM Journal on Applied Mathematics, vol. 48,No. 5, 1988, 1073-1082. cited by applicant .
Carrington, James C., et al., "Cap-Independent Enhancement ofTranslation by a Plant Potyvirus 5' Nontranslated Region", Journalof Virology, vol. 64, No 4, 1990, 1590-1597. cited by applicant.
Chaubet-Gigot, Nicole , et al., "Tissue-dependent enhancement oftransgene expression by introns of replacement histone H3 genes ofArabidopsis", Plant Molecular Biology, vol. 45, 2001, 17-30. citedby applicant .
Chen, Wenqiong , et al., "The promoter of a H2O2-inducible,Arabidopsis glutathione S-transferase gene contains closely closelylinked OBF- and OBP1-binding sites", The Plant Journal, vol. 10,No. 6, 1996, 955-966. cited by applicant .
Choi, Jungik , et al., "Tandemmassspectrometry:Anovelapproachformetabolicfluxanalysis", MetabolicEngineering, vol. 13, 2011, 225-233. cited by applicant .
Clancy, Maureen , et al., "Splicing of the Maize Sh1 First IntronIs Essential for Enhancement of Gene Expression, and a T-Rich MotifIncreases Expression without Affecting Splicing", Plant Physiol.vol. 130, 2002, 918-929. cited by applicant .
Claverie, Jean-Michel , "Information Enhancement Methods for LargeScale Sequence Analysis", Computers Che., vol. 17, No. 2, 1993,191-201. cited by applicant .
Dalcorso, Giovanni , et al., "A Complex Containing PGRL1 and PGR5Is Involved in the Switch Between Linear and Cyclic Electron Flowin Arabidopsis", Cell 132, 2008, 273-285. cited by applicant .
Dalcorso, Giovanni , et al., "A Complex Containing PGRL1 and PGR5Is Involved in the Switch between Linear and Cyclic Electron Flowin Arabidopsis", Cell, vol. 132, 2008, 273-285. cited by applicant.
Datla, Raju S.S., et al., "Improved high-level constitutive foreigngene expression in plants using an AMV RNA4 untranslated leadersequence", Plant Science, vol. 94, 1993, 139-149. cited byapplicant .
De Veylder, Lieven , et al., "Herbicide Safener-Inducible GeneExpression in Arabidopsis thaliana", Plant Cell Physiol., vol. 38,No. 5, 1997, 568-577. cited by applicant .
Dioumaev, Andrei K., et al., "Proton Transfers in the PhotochemicalReaction Cycle of Proteorhodopsin", Biochemistry, vol. 41,5348-5358, 2002. cited by applicant .
Duanmu, Deqiang , et al., "Knockdown of limiting-CO2-induced geneHLA3 decreases HCO3 transport and photosynthetic Ci affinity inChlamydomonas reinhardtii", PNAS, vol. 106, No. 14, 2009,5990-5995. cited by applicant .
Duckwall, Casey Scott, et al., "Mapping cancer cell metabolism with13C flux analysis: Recent progress and future challenges", Journalof Carcinogenesis, vol. 12, No. 13, 2013, 1-7. cited by applicant.
Egnatchik, R. A., et al., "Palmitate-induced activation ofmitochondrial metabolism promotes oxidative stress and apoptosis inH4IIEC3 rat hepatocytes", Metabolism, vol. 62, No. 2, 2014,283-295. cited by applicant .
Elleby, Bjorn , et al., "Characterization of carbonic anhydrasefrom Neisseria gonorrhoeae", Eur. J. Biochem, vol. 268, 2001,1613-1619. cited by applicant .
Fabre, Nicolas , et al., "Characterization and expression analysisof genes encoding a and b carbonic anhydrases in Arabidopsis",Plant, Cell and Environment, vol. 30, 2007, 617-629. cited byapplicant .
Farquhar, G. D., et al., "Carbon Isotope Discrimination andPhotosynthesis", Annu. Rev. Plant Physiol. Plant Mol. Biol., vol.40, 1989, 503-537. cited by applicant .
Friedrich, Thomas , et al., "Proteorhodopsin is a Light-drivenProton Pump with Variable Vectoriality", J. Mol. Biol., vol. 321,2002, 821-838. cited by applicant .
Froehlich John E., et al., "The role of the transmembrane domain indetermining the targeting of membrane proteins to either the innerenvelope or thylakoid membrane", The Plant Journal, vol. 68, 2011,844-856. cited by applicant .
Furbank, Robert T., et al., "C4 rice: a challenge for plantphenomics", Functional Plant Biology, vol. 36, No. 11, 2009,845-856. cited by applicant .
Wunsche, Jens N., et al., "Physiological and biochemical leaf andtree responses to crop load in apple", Tree Physiology, vol. 25,2005, 1253-1263. cited by applicant .
Yamaguchi-Shinozaki, Kazuko , et al., "A Nove1 cis-Acting Elementin an Arabidopsis Gene 1s Involved in Responsiveness to Drought,Lowqemperature, or High-Salt Stress", The Plant Cell, vol. 6, 1994,251-264. cited by applicant .
Young, Jamey D., et al., "An Elementary Metabolite Unit (EMU) BasedMethod of Isotopically Nonstationary Flux Analysis", Biotechnologyand Bioengineering, vol. 99, No. 3, 2008, 686-699. cited byapplicant .
Young, Jamey D., "INCA: a computational platform for isotopicallynon-stationary metabolic flux analysis", Bioinformatics, vol. 30,No. 9, 2014, 1333-1335. cited by applicant .
Young, Jamey D., et al., "Isotopomer Measurement Techniques inMetabolic Flux Analysis II: Mass Spectrometry", Methods inMolecular Biology, vol. 1083, Chapter 7, 2014, 85-108. cited byapplicant .
Young, Jamey D., "Mapping photoautotrophic metabolism withisotopically nonstationary 13C flux analysis", MetabolicEngineering, vol. 13, 2011, 656-665. cited by applicant .
Young, Jamey D., "Metabolic flux rewiring in mammalian cellcultures", Current Opinion in Biotechnology, vol. 24, 2013,1108-1115. cited by applicant .
Zhu, Xin-Guang , et al., "C4 Rice--an Ideal Arena for SystemsBiology Research", Journal of Integrative Plant Biology, vol. 52,Issue 8, 2010, 762-770. cited by applicant .
Zuo, Jianru , et al., "An estrogen receptor-based transactivatorXVE mediates highly inducible gene expression in transgenicplants", The Plant Journal, vol. 24, No. 2, 2000, 265-273. cited byapplicant .
Okutani, Satoshi , "Three Maize Leaf Ferredoxin:NADPHOxidoreductases Vary in Subchloroplast Location, Expression, andInteraction with Ferredoxin", Plant Physiology, vol. 139, 2005,1451-1459. cited by applicant .
Outchkourov, N. S., et al., "The promoter-terminator ofchrysanthemum rbcS1 directs very high expression levels in plants",Planta, vol. 216, 2003, 1003-1012. cited by applicant .
Parry, Martin A.J., et al., "Rubisco activity and regulation astargets for crop improvement", Journal of Experimental Botany, vol.64, No. 3, 2013, 717-730. cited by applicant .
Paul, Matthew J., et al., "Sink regulation of photosynthesis",Journal of Experimental Botany, vol. 52, No. 360, 2001, 1383-1400.cited by applicant .
Peltier, Gilles , "Auxiliary electron transport pathways inchloroplasts of microalgae", Photosynth Res, vol. 106, 2010, 19-31.cited by applicant .
Peng, Lianwei , et al., "Supercomplex Formation with Photosystem IIs Required for the Stabilization of the Chloroplast NADPHDehydrogenase-Like Complex in Arabidopsis", Plant Physiology, vol.155, 2011, 1629-1639. cited by applicant .
Perrine, Zoee , et al., "Optimization of photosynthetic lightenergy utilization by microalgae", Algal Research, vol. 1, 2012,134-142. cited by applicant .
Peterhansel, Christoph , et al., "Photorespiratory bypasses: howcan they work?", Journal of Experimental Botany, vol. 64, No. 3,2013, 709-715. cited by applicant .
Price, G. Dean, et al., "The cyanobacterial CCM as a source ofgenes for improving photosynthetic CO2 fixation in crop species",Journal of Experimental Botany, vol. 64, No. 3, 2013, 753-768.cited by applicant .
Reeck, Gerald R., et al., ""Homology" in Proteins and NucleicAcids: A Terminology Muddle and a Way out of It", Cell, vol. 50,1987, 667. cited by applicant .
Reiser, Leonore , et al., "The BELL7 Gene Encodes a HomeodomainProtein Involved in Pattern Formation in the Arabidopsis OvulePrimordium", Cell, vol. 83, 1995, 735-742. cited by applicant .
Ringli, Christoph , et al., "Specific interaction of the tomatobZIP transcription factor VSF-1 with a non-palindromic DNA sequencethat controls vascular gene expression", Plant Molecular Biology,vol. 37, 1998, 977-988. cited by applicant .
Romisch-Margl, Werner , et al., "13CO2 as a universal metabolictracer in isotopologue perturbation experiments", Phytochemistry,vol. 68, 2007, 2273-2289. cited by applicant .
Roslan, Hairul A., et al., "Characterization of theethanol-inducible alc geneexpression system in Arabidopsisthaliana", The Plant Journal, vol. 28, No. 2, 2001, 225-235. citedby applicant .
Sage, Tammy L., et al., "The Functional Anatomy of Rice Leaves:Implications for Refixation of Photorespiratory CO2 and Efforts toEngineer C4 Photosynthesis into Rice", Plant Cell Physiol. vol. 50,No. 4, 2009, 756-772. cited by applicant .
Sage, Rowan F., "Variation in the Kcat of Rubisco in C3 and C4plants and some implications for photosynthetic performance at highand low temperature", Journal of Experimental Botany, vol. 53, No.369, 2002, 609-620. cited by applicant .
Sakai, Tatsuya , et al., "Analysis of the Promoter of theAuxin-Inducible Gene, parC, of Tobacco", Plant Cell Physiol., vol.37, No. 7, 1996, 906-913. cited by applicant .
Sakamoto, Masahiro , et al., "Structure and Characterization of aGene for Light-Harvesting Chi a/b Binding Protein from Rice", PlantCell Physiol., vol. 32, No. 3, 1991, 385-393. cited by applicant.
Samac, Deborah A., et al., "A comparison of constitutive promotersfor expression of transgenes in alfalfa (Medicago sativa)",Transgenic Research, vol. 13, 2004, 349-361. cited by applicant.
Sanger, Margaret , et al., "Characteristics of a strong promoterfrom figwort mosaic virus: comparison with the analogous 35Spromoter from cauliflower mosaic virus and the regulated mannopinesynthase promoter", Plant Molecular Biology, vol. 14, 1990,433-443. cited by applicant .
Schaffner, Anton R., et al., "Maize rbcS Promoter Activity Dependson Sequence Elements Not Found in Dicot rbcS Promoters", Ihe PlantCell, vol. 3, 1991, 997-1012. cited by applicant .
Sekiyama, Yasuyo , et al., "Towards dynamic metabolic networkmeasurements by multi-dimensional NMR-based fluxomics",Phytochemistry, vol. 68, 2007, 2320-2329. cited by applicant .
Shastri, Avantika A., et al., "A transient isotopic labelingmethodology for 13C metabolic flux analysis of photoautotrophicmicroorganisms", Phytochemistry, vol. 68, 2007, 2302-2312. cited byapplicant .
Sheen, Jen , "Ca2+-dependent protein kinases and stress signaltransduction in plants", Science, vol. 274, No. 5294, 1996,1900-1902. cited by applicant .
Shi, Rebecca , et al., "Engineering Oryza sativa to Express thePhoteorhodopsin Photosystem",http://openwetware.org/wiki/20.109(F12):Mod3_OrangeTR_Pre-proposal,2012, 1-4. cited by applicant .
Shi, Lifang , et al., "Gibberellin and abscisic acid regulate GAST1expression at the level of transcription", Plant Molecular Biology,vol. 38, 1998, 1053-1060. cited by applicant .
Shikanai, Toshiharu , "Central role of cyclic electron transportaround photosystem I in the regulation of photosynthesis", CurrentOpinion in Biotechnology, vol. 26, 2014, 25-30. cited by applicant.
Si, Li-Zhen , et al., "Isolation of a 1 195 bp 5-Flanking Region ofRice Cytosolic Fructose-1, 6-bisphosphatase and Analysis of ItsExpression in Transgenic Rice", Acta Botanica Sinica, vol. 3, 2003,359-364. cited by applicant .
Siebertz, Barbara , et al., "cis-Analysis of the Wound-InduciblePromoter wun7 in Transgenic Tobacco Plants and HistochemicalLocalization of Its Expression", The Plant Cell, vol. 1, 1989,961-968. cited by applicant .
Sinclair, Thomas R., "Historical Changes in Harvest Index and CropNitrogen Accumulation", Crop Science, vol. 38, No. 3, 1998,638-643. cited by applicant .
Slewinski, Thomas L., et al., "Current perspectives on theregulation of whole-plant carbohydrate partitioning", PlantScience, vol. 178, 2010, 341-349. cited by applicant .
Sonnewald, U. , "Mianipulation of sink-source relations intransgenic plants", Plant, Cell and Environment, vol. 17, 1994,649-658. cited by applicant .
Sonnewald, Uwe , et al., "Molecular Approaches to Sink-SourceInteractions", Plant Physiol., vol. 99, 1992, 1267-1270. cited byapplicant .
Srour, Orr , et al., "Fluxomers: a new approach for 13C metabolicflux analysis", BMC Systems Biology, vol. 5, No. 129, 2011, 1-15.cited by applicant .
Stange, Claudia , et al., "Phosphorylation of nuclear proteinsdirects binding to salicylic acid-responsive elements", The PlantJournal, vol. 11, No. 6, 1997, 1315-1324. cited by applicant .
Streit, Wolfgang R., et al., "A Biotin-Regulated Locus, bioS, in aPossible Survival Operon of Rhizobium meliloti", MPMI vol. 10, No.7, 1997, 933-937. cited by applicant .
Subramanian, Sowmya , "Comparative energetics and kinetics ofautotrophic lipid and starch metabolism in chlorophytic microalgae:implications for biomass and biofuel production", Biotechnology forBiofuels, vol. 6, No. 150, 2013, 1-12. cited by applicant .
Suorsa, Marjaana , et al., "PGR5-PGRL1-Dependent Cyclic ElectronTransport Modulations Linear Electron Transport Rate in Arabidopsisthaliana", Molecular Plant, vol. 9, 2016, 271-288. cited byapplicant .
Sweetlove, L. J., et al., "Source metabolism dominates the controlof source to sink carbon flux in tuberizing potato plantsthroughout the diurnal cycle and under a range of environmentalconditions", Plant, Cell and Environment, vol. 23, 2000, 523-529.cited by applicant .
Szecowka, Marek , et al., "Metabolic Fluxes in an IlluminatedArabidopsis Rosette", The Plant Cell, vol. 25, 2013, 694-714. citedby applicant .
Takahashi, Hiroko , et al., "Cyclic electron flow isredox-controlled but independent of state transition", NatureCommunication, vol. 4, No. 1954, 2013, 1-8. cited by applicant.
Van Der Kop, Dianne A.M., et al., "Selection of Arabidopsis mutantsoverexpressing genes driven by the promoter of an auxin-inducibleglutathione S-transferase gene", Plant Molecular Biology, vol. 39,1999, 970-990. cited by applicant .
Victorio, Reynaldo G., et al., "Growth, Partitioning, and HarvestIndex of Tuber-Bearing Solanum Genotypes Grown in Two ContrastingPeruvian Environments", Plant Physiol., vol. 82, 1986, 103-108.cited by applicant .
Vos, J. , "The nitrogen response of potato (Solanum tuberosum L.)in the field: nitrogen uptake and yield, harvest index and nitrogenconcentration.", Potato Research, vol. 40, 1997, 237-248. cited byapplicant .
Walter, Jessica M., et al., "Light-powering Escherichia coli withproteorhodopsin", PNAS, vol. 104, No. 7, 2007, 2408-2412. cited byapplicant .
Walter, Jessica M., et al., "Potential of light-harvesting protonpumps for bioenergy applications", Current Opinion inBiotechnology, vol. 21, 2010, 265-270. cited by applicant .
Wang, Yingjun , et al., "Carbon dioxide concentrating mechanism inChlamydomonas reinhardtyy: inorganic carbon transport and CO2recapture", Photosynth Res, vol. 109, 2011, 115-122. cited byapplicant .
Weber, Andreas PM, et al., "Plastid transport and metabolism of C3and C4 plants--comparative analysis and possible biotechnologicalexploitation", Current Opinion in Plant Biology, vol. 13, 2010,257-265. cited by applicant .
Willmott, Ruth L., et al., "DNase1 footprints suggest theinvolvement of at least three types of transcription factors in theregulation of alpha-Amy2/A by gibberellin", Plant MolecularBiology, vol. 38, 1998, 817-825. cited by applicant .
Wootton, John C., et al., "Statistics of Local Complexity in AminoAcid Sequences and Sequence Databases", Computers Chem. vol. 17,No. 2, 1993, 149-163. cited by applicant .
Gatz, C. , et al., "Chemical Control of Gene Expression", Annu.Rev. Plant Physiol. Plant Mol. Biol., vol. 48, 1997, 89-108. citedby applicant .
Goldschmidt, Eliezer E., et al., "Regulation of Photosynthesis byEnd-Product Accumulation in Leaves of Plants Storing Starch,Sucrose, and Hexose Sugars", Plant Physiol., vol. 99, 1992,1443-1448. cited by applicant .
Govindjee, Rajni , et al., "Arginine-82 Regulates the PKa of theGroup Responsible for the Light-Driven Proton Release inBacteriorhodopsin", Biophysical Journal, vol. 71, 1996, 1011-1023.cited by applicant .
Govindjee, Rajni , et al., "Mutation of a Surface Residue,Lysine-129, Reverses the Order of Proton Release and Uptake inBacteriorhodopsin; Guanidine Hydrochloride Restores It",Biophysical Journal, vol. 72, 1997, 886-898. cited by applicant.
Govindjee, Rajni , et al., "The Quantum Efficiency of ProtonPumping by the Purple Membrane of Halobacterium Halobium", Biophys.J., vol. 30, 1980, 231-242. cited by applicant .
Guevara-Garcia, Arturo , et al., "A 42 bp fragment of the pmas10promoter containing an ocs-like element confers a developmental,wound- and chemically inducible expression pattern", PlantMolecular Biology, vol. 38, 1998, 743-753. cited by applicant .
Hanke, Guy Thomas, et al., "Multiple iso-proteins of FNR inArabidopsis : evidence for different contributions to chloroplastfunction and nitrogen assimilation", Plant, Cell and Environment,vol. 28, 2005, 1146-1157. cited by applicant .
Harpster, Mark H., et al., "Relative strengths of the 35Scaliflower mosaic virus, 1', 2', and nopaline synthase promoters intransformed tobacco sugarbeet and oilseed rape callus tissue", MolGen Genet, vol. 212, 1988, 182-190. cited by applicant .
Hausler, Rainer E., "Overexpression of C4-cycle enzymes intransgenic C3 plants: a biotechnilogical approach to improveC3-photosynthesis", Journal of Experimental Botany, vol. 53, No.369, 2002, 591-607. cited by applicant .
Hay, R. K. M., et al., "Variation in the harvest index of tropicalmaize: evaluation of recent evidence from Mexico and Malawi", Ann.appl. Biol., vol. 138, 2001, 103-109. cited by applicant .
Henikoff, Steven , et al., "Amino acid substitution matrices fromprotein blocks", Proc. Natl. Acad. Sci. USA, vol. 89, 1992,10915-10919. cited by applicant .
Henkes, Stefan , et al., "A Small Decrease of Plastid TransketolaseActivity in Antisense Tobacco Transformants Has Dramatic Effects onPhotosynthesis and Phenylpropanoid Metabolism", The Plant Cell,vol. 13, 2001, 535-551. cited by applicant .
Hertle, Alexander P., et al., "PGRL1 Is the ElusiveFerredoxin-Plastoquinone Reductase in Photosynthetic CyclicElectron Flow", Molecular Cell, vol. 49, 2013, 511-523. cited byapplicant .
Holtorf, Sonke , et al., "Comparison of different constitutive andinducible promoters for the overexpression of transgenes inArabidopsis thaliana", Plant Molecular Biology, vol. 29, 1995,637-646. cited by applicant .
Huege, Jan , et al., "GC-EI-TOF-MS analysis of in vivocarbon-partitioning into soluble metabolite pools of higher plantsby monitoring isotope dilution after 13CO2 labelling",Phytochemistry, vol. 68, 2007, 2258-2272. cited by applicant .
Ihemere, Uzoma , "Genetic modification of cassava for enhancedstarch production", Plant Biotechnology Journal, vol. 4, 2006,453-465. cited by applicant .
Jazmin, Lara J., "Isotopically Nonstationary 13C Metabolic FluxAnalysis", Systems Metabolic Engineering: Methods and Protocols,Methods in Molecular Biology, vol. 985, Chapter 18, 2013, 367-390.cited by applicant .
Jazmin, Lara J., "Isotopically Nonstationary MFA (INST-MFA) ofAutotrophic Metabolism", Methods in Molecular Biology, vol. 1090,Chapter 12, 2014, 181-210. cited by applicant .
Johnson, Giles N., "Physiology of PSI cyclic electron transport inhigher plants", Biochimica et Biophysica Acta, vol. 1807, 2011,384-389. cited by applicant .
Joliot, Pierre , et al., "Regulation of cyclic and linear electronflow in higher plants", PNAS, vol. 108, No. 32, 2011, 13317-13322.cited by applicant .
Jonik, Claudia , et al., "Simultaneous boosting of source and sinkcapacities doubles tuber starch yield of potato plants", PlantBiotechnology Journal, vol. 10, 2012, 1088-1098. cited by applicant.
Karlin, Samuel , et al., "Applications and statistics for multiplehigh-scoring segments in molecular sequences", Proc. Natl. Acad.Sci. USA, vol. 90, 1993, 5873-5877. cited by applicant .
Kay, Robert , et al., "Duplication of CaMV 35S promoter sequencescreates a strong enhancer for plant genes", Science, vol. 236,1987, 1299-1302. cited by applicant .
Kelemen, Zsolt , et al., "Transformation vector based on promoterand intron sequences of a replacement histone H3 gene. A tool forhigh, constitutive gene expression in plants", Transgenic Research,vol. 11, 2002, 69-72. cited by applicant .
Kim, Jaoon YH, et al., "Improved production of biohydrogen inlightpowered Escherichia coli by co-expression of proteorhodopsinand heterologous hydrogenase", Microbial Cell Factories, vol. 11,No. 2, 2012, 1-7. cited by applicant .
Kramer, David M., et al., "The Importance of Energy Balance inImproving Photosynthetic Productivity", Plant Physiology, vol. 155,2011, 70-78. cited by applicant .
Kramer, David M., et al., "The Importance of Energy Balance inImproving Photosynthetic Productivity 1[W]", Plant Physiology, vol.155, 2011, 70-78. cited by applicant .
Kuhlemeier, Cris , et al., "The Pea rbcS-3A Promoter Mediates LightResponsiveness but not Organ Specificity", The Plant Cell, vol. 1,1989, 471-478. cited by applicant .
Kyte, Jack , "A simple method for displaying the hydropathiccharacter of a protein", J. Mol. Biol., vol. 157, No. 1, 1982,105-132. cited by applicant .
Lakatos, Melinda , et al., "The Photochemical Reaction Cycle ofProteorhodopsin at Low pH", Biophysical Journal, vol. 84, 2003,3252-3256. cited by applicant .
Leamy, Alexandra , et al., "Modulating lipid fate controlslipotoxicity in palmitate-treated hepatic cells", The FASEBJournal, vol. 27, No. 1, 2013, 1. cited by applicant .
Leamy, Alexandra K., "Molecular mechanisms and the role ofsaturated fatty acids in the progression of non-alcoholic fattyliver disease", Progress in Lipid Research, vol. 52, 2013, 165-174.cited by applicant .
Lindqvist, Annika , et al., "Biochemical Properties of PurifiedRecombinand Human Beta-Carotene 15, 15'-Monooxygenase", The Journalof Biological Chemistry, vol. 277, No. 26, 2002, 23942-23948. citedby applicant .
Liu, Zhan-Bin , et al., "A G-Box-Binding Protein from Soybean Bindsto the E1 Auxin-Response Element in the Soybean CH3 Promoter andContains a Proline-Rich Repression Domain", Plant Physiol., vol.115, 1997, 397-407. cited by applicant .
Lu, Chaofu , et al., "Generation of transgenic plants of apotential oilseed crop Camelina sativa by Agrobacterium-mediatedtransformation", Plant Cell Rep, vol. 27, 2008, 273-278. cited byapplicant .
Ma, Fangfang , et al., "Isotopically nonstationary 13C fluxanalysis of changes in Arabidopsis thaliana leaf metabolism due tohigh light acclimation", PNAS, vol. 111, No. 7, 2014, 16967-16972.cited by applicant .
Mandel, Therese , et al., "Definition of constitutive gene expression in plants: the translation initiation factor 4A gene as a model", Plant Molecular Biology, vol. 29, 1995, 995-1004. cited by applicant .
Mandy, Dominic E., et al., "Metabolic flux analysis using 13C peptide label measurements", The Plant Journal, vol. 77, 2014, 476-486. cited by applicant .
Manners, John M., et al., "The promoter of the plant defensin gene PDF1.2 from Arabidopsis is systemically activated by fungal pathogens and responds to methyl jasmonate but not to salicylic acid", Plant Molecular Biology, vol. 38, 1998, 1071-1080. cited by applicant .
Martinez, A. , "Proteorhodopsin photosystem gene expression enables photophosphorylation in a heterologous host", PNAS, vol. 104, No. 13, 2007, 5590-5595. cited by applicant .
Mascarenhas, Desmond , et al., "Intron-mediated enhancement of heterologous gene expression in maize", Plant Molecular Biology, vol. 15, 1990, 913-920. cited by applicant .
Masclaux-Daubresse, Celine , et al., "Exploring nitrogen remobilization for seed filling using natural variation in Arabidopsis thaliana", Journal of Experimental Botany, vol. 62, No. 6, 2011, 2131-2142. cited by applicant .
Masgrau, Carles , et al., "Inducible overexpression of oat arginine decarboxylase in transgenic tobacco plants", The Plant Journal, vol. 11, No. 3, 1997, 465-473. cited by applicant .
Mcatee, Allison G., et al., "Role of Chinese hamster ovary central carbon metabolism in controlling the quality of secreted biotherapeutic proteins", Pharm. Bioprocess., vol. 2, No. 1, 2014, 63-74. cited by applicant .
Minagawa, Jun , "State transitions--The molecular remodeling of photosynthetic supercomplexes that controls energy flow in the chloroplast", Biochimica et Biophysica Acta, vol. 1807, 2011, 897-905. cited by applicant .
Miyagawa, Yoshiko , et al., "Overexpression of a cyanobacterial fructose-1,6-/sedoheptulose-1,7-bisphosphatase in tobacco enhances photosynthesis and growth", Nature Biotechnology, vol. 19, 2001, 965-969. cited by applicant .
Moroney, James V., et al., "Photorespiration and carbon concentrating mechanisms: two adaptations to high O2, low CO2 conditions", Photosynth Res, vol. 117, 2013, 121-131. cited by applicant .
Nakamura, Naoya , et al., "Promotion of cyclic electron transport around photosystem I during the evolution of NADP-malic enzyme-type C4 photosynthesis in the genus Flaveria", New Phytologist, vol. 199, 2013, 832-842. cited by applicant .
Odell, Joan T., et al., "Identification of DNA sequences required for activity of the cauliflower mosaic virus 35S promoter", Nature, vol. 313, No. 6005, 1985, 810-812. cited by applicant .
Odell, Joan T., "Seed-Specific Gene Activation Mediated by the Cre/lox Site-Specific Recombination System", Plant Physiol., vol. 106, 1994, 447-458. cited by applicant.
Primary Examiner: Kallis; Russell
Attorney, Agent or Firm: Peacock Law P.C. Vilven; Janeen
Government Interests
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
This invention was made with government support under grants Nos. DOE-CECO Prime No: DE-AR0000202, Sub No: 21018-N; DOE-CABS Prime No: DE-SC0001295, Sub No: 21017-NM NSF EF-1219603, NSF No: 1219603. The U.S. government has certain rights in the invention.
Parent Case Text
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a continuation of U.S. patent application Ser. No. 16/358,331, titled "Improved Carbon Fixation Systems in Plants and Algae", filed on Mar. 19, 2019, which claims priority to U.S. patent application Ser. No. 15/411,854, entitled "Improved Carbon Fixation Systems in Plants and Algae", filed on Jan. 20, 2017, and issued on Mar. 19, 2019 as U.S. Pat. No. 10,233,458, which is a continuation of International Patent Application No. PCT/US2015/041617, entitled "Improved Carbon Fixation Systems in Plants and Algae", filed on Jul. 22, 2015, which claims priority to and the benefit of the filing of U.S. Provisional Patent Application No. 62/027,354, entitled "Carbon Fixation Systems in Plants and Algae", filed on Jul. 22, 2014, and the specification and claims thereof are incorporated herein by reference.
Claims
What is claimed is:
1. A construct comprising: i) a first heterologous nucleic acid sequence comprising a first heterologous polynucleotide sequence encoding a cyclic electron modulator gene wherein the cyclic electron modulator is operatively linked to at least one regulatory element wherein the first heterologous nucleic acid sequence encodes a protein having a sequence selected from PGR5 of SEQ ID NO: 1 or PGRL1 of SEQ ID NO: 3 or a sequence with at least 80% sequence homology thereto; and ii) a second heterologous nucleic acid sequence comprising a second heterologous polynucleotide sequence encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane wherein the ATP dependent bicarbonate anion transporter localized to the plasma membrane is operatively coupled to the at least one regulatory element wherein the second heterologous nucleic acid sequence encodes a protein having a sequence selected from HLA3 (SEQ ID NO:77) or a sequence with at least 80% sequence homology thereto.
2. The construct of claim 1 wherein the HLA3 is codon optimized for plant expression.
3. The construct of claim 1 wherein the at least one regulatory element includes a promoter.
4. The construct of claim 1 wherein the at least one regulatory element is a tissue specific promoter.
5. The construct of claim 1 wherein the at least one regulatory element includes a promoter that is a green tissue/leaf-specific promoter.
6. The construct of claim 1 wherein the promoter is selected from among CAB and rbcS.
7. The construct of claim 1 wherein the first nucleic acid sequence and the second nucleic acid sequence encode: i) the PGR5 protein, and the HLA3 protein; or ii) the PGRL1 protein, and the HLA3 protein.
8. The construct of claim 1 further comprising iii) a third heterologous nucleic acid sequence comprising a third heterologous polynucleotide sequence encoding a bicarbonate anion transporter protein localized to the chloroplast envelope wherein the bicarbonate anion transporter protein localized to the chloroplast envelope is operatively coupled to the regulatory element wherein the third heterologous nucleic acid sequence encodes a sequence selected from LCIA of SEQ ID NO: 18 or a sequence with at least 80% sequence homology thereto.
9. The construct of claim 8 wherein the at least one regulatory element is a green tissue/leaf-specific promoter.
10. The construct of claim 8 wherein the at least one regulatory element includes a promoter selected from among CAB and rbcS.
11. The construct of claim 1 further comprising iii) a third heterologous nucleic acid sequence comprising a third heterologous polynucleotide sequence encoding a carbonic anhydrase wherein the carbonic anhydrase is operatively coupled to the at least one regulatory element wherein the third heterologous nucleic acid sequence encodes a protein selected from a human carbonic anhydrase-2 (HCA2) of SEQ ID NO: 19, bacterial Neisseria gonorrhoeae carbonic anhydrase (BCA) of SEQ ID NO: 5 or a sequence with at least 80% sequence homology thereto.
12. The construct of claim 11 wherein the at least one regulatory element includes a green tissue/leaf-specific promoter.
13. The construct of claim 11 wherein the at least one regulatory element includes a promoter selected from among CAB and rbcS.
14. The construct of claim 11 wherein the third heterologous nucleic acid sequence encodes the BCA protein of SEQ ID NO: 5 or a sequence with at least 80% sequence homology thereto.
15. The construct of claim 8 wherein the heterologous nucleotide sequences encode the PGR5 protein, the HLA3 protein, and the LCIA protein or sequences with at least 80% homology thereto.
16. The construct of claim 11 wherein the heterologous nucleotide sequences encode the PGR5 protein, the HLA3 protein, and the BCA protein or sequences with at least 80% homology thereto.
17. The construct of claim 11, wherein a) the PGR5 protein has an amino acid sequence at least 80% identical to SEQ ID NO:1; b) the HLA3 protein has an amino acid sequence at least 80% identical to SEQ ID NO:77; and c) the BCA protein has an amino acid sequence at least 80% identical to SEQ ID NO:21.
18. The construct of claim 1 wherein the at least one regulatory element of the first heterologous nucleic acid sequence and the at least one regulatory element of the second heterologous nucleic acid sequence includes a promoter which can be the same or different for the first heterologous nucleic acid and the second heterologous nucleic acid.
19. A seed comprising the construct of claim 1.
20. A vector comprising the construct of claim 1.
Description
SEQUENCE LISTING
The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Apr. 5, 2017, is named 040517_NMC0001-101-US_Sequence_Listing_ST25.txt and is 286 KB in size.
INCORPORATION BY REFERENCE OF MATERIAL SUBMITTED ON A COMPACT DISC
Not Applicable.
COPYRIGHTED MATERIAL
Not Applicable.
BACKGROUND
A major factor limiting photosynthetic efficiency is the competitive inhibition of CO.sub.2 fixation by oxygen, due to lack of specificity of the enzyme RuBisCO. Incorporation of oxygen by RuBisCO is the first dedicated step in photorespiration, a pathway that respires CO.sub.2, compounding photosynthetic inefficiency. Overall, photorespiration reduces photosynthetic productivity by as much as 50% [1]. To date, attempts to engineer reduced oxygenase activity in RuBisCO have been largely unsuccessful.
Significantly, cyanobacteria, eukaryotic microalgae, and C4 plants have evolved mechanisms to reduce photorespiration by concentrating CO.sub.2 near RuBisCO, competitively inhibiting oxygenase activity and leading to substantial increases in yield and water use efficiency per unit carbon fixed. However, carbon concentrating systems (CCMs) are not operational in the vast majority of plant species (i.e., C3 plants).
Attempts to reconstitute functional CCMs in C3 plants have been made previously by us and others, mainly focusing on engineering pathways that are directly involved in facilitating CO.sub.2 transport into leaf chloroplasts. Note, for example, PCT International Publication WO 2012/125737; Sage and Sage (2009) Plant and Cell Physiol. 50(4):756-772; Zhu et al. (2010) J. Integr. Plant Biol. 52(8):762-770; Furbank et al. (2009) Funct. Plant Biol. 36(11):845-856; Weber and von Caemmerer (2010) Curr. Opin. Plant Biol.; Price (2013) J. Exp. Bot. 64(3):753-68; and U.S. Patent Application Publication No. 2013/0007916 A1.
However, ATP and NADPH production through light harvesting and electron transfer steps must be coordinated with carbon assimilation and additional energy-requiring steps, including CCM systems, to prevent photoinhibition and to improve growth. Additionally, assimilatory flux and storage rates can limit carbon fixation due to feedback inhibition when sink demand is not matched to source capacity [2].
Thus, there is a critical need to improve plant productivity through integrated systems engineering approaches that balance source/sink interactions with energy and reductant production to develop energy-requiring, artificial CCMs that can effectively mimic those found in nature.
BRIEF SUMMARY OF THE INVENTION
Accordingly, in response to this need, the present disclosure provides methods for elevating cyclic electron transfer activity, improving carbon concentration, and enhancing carbon fixation in C3 and C4 plants, and algae, and producing biomass or other products from C3 or C4 plants, and algae, selected from among, for example, starches, oils, fatty acids, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids, as well as transgenic plants produced thereby. These methods and transgenic plants and algae encompass the expression, or overexpression, of various combinations of genes that improve carbon concentrating systems in plants and algae, such as bicarbonate transport proteins, carbonic anhydrase, light driven proton pump, cyclic electron flow regulators, etc. Thus, among its various embodiments, the present disclosure provides the following:
A first embodiment of the present invention provides for a transgenic plant or alga, comprising within its genome, and expressing or overexpressing, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane and a cyclic electron transfer modulator protein. The cyclic electron transfer modulator protein may be selected from a PGRL1 protein (for example SEQ ID NO:3), a PGR5 protein (for example SEQ ID NO:1), a leaf FNR1 protein (for example SEQ ID NO:96), a leaf FNR2 protein (for example SEQ ID NO:97), a Fd1 protein (for example SEQ ID NO:95), or any combination thereof, and the ATP dependent bicarbonate anion transporter localized to the plasma membrane may be, for example, a HLA3 protein (for example SEQ ID NO:77). The transgenic plant or alga described may further comprise within its genome, and express or overexpress, a heterologous nucleotide sequence encoding a bicarbonate anion transporter protein localized to the chloroplast envelope. The transgenic plant or alga described herein may further comprise within its genome, and express or overexpress, a heterologous nucleotide sequence encoding a carbonic anhydrase protein. In a preferred embodiment the cyclic electron transfer modulator protein is a PGR5 protein; in another preferred embodiment it is a Fd1 protein; in still another preferred embodiment it is leaf FNR1; and in a further preferred embodiment it is PGRL1. In a preferred embodiment the heterologous nucleotide sequences of the transgenic plant or alga encode i) a PGR5 protein and a HLA3 protein; or ii) a PGR5 protein, a HLA3 protein and a PGRL1 protein; or iii) a PGR5 protein, a HLA3 protein, and a LCIA protein; or iv) a PGR5 protein, a HLA3 protein, a PGRL1 protein, a LCIA protein, and a BCA or HCA2 protein. In another preferred embodiment the heterologous nucleotide sequences of the transgenic plant or alga encode a PGR5 protein, a HLA3 protein, a LCIA protein and a BCA or optionally a HCA2 protein. In the transgenic plant or alga as described, the PGR5 protein has an amino acid sequence at least 80% identical to SEQ ID NO:1; the HLA3 protein has an amino acid sequence at least 80% identical to SEQ ID NO:77; the PGRL1 protein has an amino acid sequence at least 80% identical to SEQ ID NO:3; the LCIA protein has an amino acid sequence at least 80% identical to SEQ ID NO:18; and/or the BCA protein has an amino acid sequence at least 80% identical to SEQ ID NO:21. Alternatively, the sequence identity/sequence similarity is about 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100% to those specifically disclosed, which include for example proteins without a transit peptide sequence and the functional protein.
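As a rough illustration of how an "at least 80% identical" threshold might be checked, the following sketch computes percent identity over a pairwise alignment supplied as two equal-length gapped strings. The function names, the choice of denominator (all alignment columns), and the toy peptide fragments are assumptions made for illustration; the disclosure does not prescribe a particular alignment program or identity formula, and the fragments are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must already be aligned (same length, '-' marks a gap).
    Assumption: identity = identical residue columns / all alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when the aligned pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical, made-up fragments used only to exercise the functions.
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate))
```

Other conventions divide by the length of the shorter sequence or by the number of non-gap columns, so a reported percentage is only meaningful together with the convention used.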
A second embodiment provides for a transgenic plant or alga, comprising within its genome, and expressing or overexpressing, a combination of heterologous nucleotide sequences encoding a LCIA protein and a BCA protein or HCA protein. In a preferred embodiment of the transgenic plant or alga, the LCIA protein has an amino acid sequence at least 80% identical to SEQ ID NO:18; and/or the BCA protein has an amino acid sequence at least 80% identical to SEQ ID NO:21 and the HCA protein has an amino acid sequence at least 80% identical to SEQ ID NO:19. Alternatively, the sequence identity/sequence similarity is about 75%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 100% to those specifically disclosed, which include for example proteins without a transit peptide sequence and the functional protein.
A third embodiment provides for a transgenic plant or alga, comprising within its genome, and expressing or overexpressing, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane (for example SEQ ID NO:77), a bicarbonate anion transporter localized to the chloroplast envelope (for example SEQ ID NO:18), a carbonic anhydrase, a proteorhodopsin protein targeted to thylakoid membranes (for example SEQ ID NO:98), and a .beta.-carotene monooxygenase protein (for example SEQ ID NO:100). In a preferred embodiment the proteorhodopsin comprises a chloroplast transit peptide selected from among a psbX stop-transfer trans-membrane domain fused to its C-terminus, a DNAJ transit peptide, a CAB transit peptide, a PGR5 transit peptide, and a psaD transit peptide. In another preferred embodiment the .beta.-carotene monooxygenase is expressed under the control of a promoter selected from among an ethanol inducible gene promoter and a green tissue/leaf-specific promoter selected from among CAB and rbcS. The proteorhodopsin may comprise an amino acid substitution selected from among L219E/T206S, M79T, and M79Y, and combinations thereof.
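The substitution names above follow the common "original residue, position, replacement residue" notation, with "/" joining substitutions made together. A minimal sketch of reading and applying such a specification is given below; the helper name, the 1-based position convention relative to the supplied string, and the short toy sequence are assumptions for illustration and do not represent the proteorhodopsin of SEQ ID NO:98.

```python
import re

# One substitution token: original residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence: str, spec: str) -> str:
    """Apply substitutions such as 'M79T' or 'L219E/T206S' to a protein string.

    Assumption: positions are 1-based and refer to the supplied sequence.
    Raises ValueError if a token is malformed, out of range, or if the
    residue found at the position differs from the one named in the token.
    """
    residues = list(sequence)
    for token in spec.split("/"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        original, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3)
        )
        index = position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Hypothetical five-residue toy sequence, not a real protein.
toy = "AMKLT"
print(apply_substitutions(toy, "M2T"))
print(apply_substitutions(toy, "L4E/T5S"))
```

Checking the original residue before replacing it guards against numbering mismatches, for example when a transit peptide has been added to or removed from the reference sequence.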
The carbonic anhydrase of the first, second, or third embodiment may be a BCA or optionally a HCA2 protein. The bicarbonate anion transporter localized to the chloroplast envelope of the first, second and third embodiment may be a LCIA protein. The ATP dependent bicarbonate anion transporter localized to the plasma membrane of the first and third embodiments may be HLA3.
A fourth embodiment provides for a method of making a transgenic plant or alga of a first embodiment wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane and a cyclic electron transfer modulator protein. The cyclic electron transfer modulator protein may be selected from a PGRL1 protein, a PGR5 protein, a FNR1 protein, a FNR2 protein (leaf-form isoforms), a Fd1 protein, or any combination thereof, and the ATP dependent bicarbonate anion transporter localized to the plasma membrane is a HLA3 protein. The heterologous nucleotide sequences of the fourth embodiment further encode a bicarbonate anion transporter protein localized to the chloroplast envelope, for example a LCIA protein. Additionally, the heterologous nucleotide sequences encode a carbonic anhydrase protein, for example a BCA protein or optionally a HCA2 protein. In a preferred embodiment the cyclic electron transfer modulator protein is a PGR5 protein, optionally a PGRL1 protein, or a combination thereof.
A fifth embodiment provides a method of making a transgenic plant or alga as described in a second embodiment, wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding a LCIA protein and a BCA protein or optionally a HCA protein.
A sixth embodiment provides a method of making a transgenic plant or alga of a third embodiment wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane, a bicarbonate anion transporter, a carbonic anhydrase, a proteorhodopsin protein targeted to thylakoid membranes, and a .beta.-carotene monooxygenase protein. In a preferred embodiment the proteorhodopsin comprises a chloroplast transit peptide selected from among a psbX stop-transfer trans-membrane domain fused to its C-terminus, a DNAJ transit peptide, a CAB transit peptide, a PGR5 transit peptide, and a psaD transit peptide. In another preferred embodiment the .beta.-carotene monooxygenase is expressed under the control of a promoter selected from among an ethanol inducible gene promoter and a green tissue/leaf-specific promoter selected from among CAB and rbcS. In a preferred embodiment the proteorhodopsin comprises an amino acid substitution selected from among L219E/T206S, M79T, and M79Y, and combinations thereof. In another preferred embodiment the ATP dependent bicarbonate anion transporter localized to the plasma membrane is HLA3.
The transgenic plant of an embodiment disclosed herein may be a C3 plant or a C4 plant such as a transgenic oilseed plant or a transgenic food crop plant which may include the genera Brassica (e.g., rapeseed/canola (Brassica napus; Brassica carinata; Brassica nigra; Brassica oleracea)), Camelina, Miscanthus, and Jatropha; Jojoba (Simmondsia chinensis), coconut; cotton; peanut; rice; safflower; sesame; soybean; mustard other than Arabidopsis; wheat; flax (linseed); sunflower; olive; corn; palm; palm kernel; sugarcane; castor bean; switchgrass; Borago officinalis; Echium plantagineum; Cuphea hookeriana; Cuphea pulcherrima; Cuphea lanceolata; Ricinus communis; Coriandrum sativum; Crepis alpina; Vernonia galamensis; Momordica charantia; and Crambe abyssinica; wheat, rice, maize (corn), barley, oats, sorghum, rye, and millet; peanuts, chickpeas, lentils, kidney beans, soybeans, lima beans; potatoes, sweet potatoes, and cassavas; soybeans, corn, canola, peanuts, palm, coconuts, safflower, cottonseed, sunflower, flax, olive, and safflower; sugar cane and sugar beets; bananas, oranges, apples, pears, breadfruit, pineapples, and cherries; tomatoes, lettuce, carrots, melons, strawberry, asparagus, broccoli, peas, kale, cashews, peanuts, walnuts, pistachio nuts, almonds; forage and turf grasses; alfalfa, clover; coffee, cocoa, kola nut, poppy; vanilla, sage, thyme, anise, saffron, menthol, peppermint, spearmint and coriander; and preferably wheat, rice and canola. The transgenic alga of an embodiment disclosed herein may be selected from among a Chlorella species, a Nannochloropsis species, and a Chlamydomonas species. The heterologous nucleotide sequences described in an embodiment may be codon-optimized for expression in said transgenic plant or alga.
One aspect of the present invention provides for a transgenic plant or alga as described in an embodiment which exhibits enhanced CO.sub.2 fixation compared to an otherwise identical control plant grown under the same conditions, for example wherein CO.sub.2 fixation is enhanced in the range of from about 10% to about 50% compared to that of an otherwise identical control plant grown under the same conditions.
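The stated range can be read as a relative increase over the control. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, the rates are made-up numbers in arbitrary but identical units, and the bounds are treated as exact even though the text says "about".

```python
def percent_enhancement(transgenic_rate: float, control_rate: float) -> float:
    """Relative increase of a transgenic line's CO2 fixation rate over control.

    Both rates must be in the same units and measured under the same
    conditions; the control rate must be positive.
    """
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (transgenic_rate - control_rate) / control_rate


def within_stated_range(
    transgenic_rate: float,
    control_rate: float,
    low: float = 10.0,
    high: float = 50.0,
) -> bool:
    """True when the enhancement lies between `low` and `high` percent."""
    return low <= percent_enhancement(transgenic_rate, control_rate) <= high


# Made-up example values, not measurements from the disclosure.
control = 20.0
transgenic = 25.0
print(percent_enhancement(transgenic, control))
print(within_stated_range(transgenic, control))
```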
Another embodiment provides for a part of said transgenic plant or alga of any embodiment described herein. For example, the part of said transgenic plant may be selected from among a protoplast, a cell, a tissue, an organ, a cutting, an explant, a reproductive tissue, a vegetative tissue, biomass, an inflorescence, a flower, a sepal, a petal, a pistil, a stigma, a style, an ovary, an ovule, an embryo, a receptacle, a seed, a fruit, a stamen, a filament, an anther, a male or female gametophyte, a pollen grain, a meristem, a terminal bud, an axillary bud, a leaf, a stem, a root, a tuberous root, a rhizome, a tuber, a stolon, a corm, a bulb, an offset, a cell of said plant in culture, a tissue of said plant in culture, an organ of said plant in culture, a callus, propagation materials, germplasm, cuttings, divisions, and propagations.
Yet another embodiment provides for a progeny or derivative of said transgenic plant or alga of any embodiment described herein. For example, the progeny or derivatives may be selected from among clones, hybrids, samples, seeds, and harvested material thereof and may be produced sexually or asexually.
Another embodiment of the present invention provides a method of elevating cyclic electron transfer (CET) activity in a C3 plant, C4 plant, or alga wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane and a cyclic electron transfer modulator protein.
Yet another embodiment provides a method of enhancing carbon fixation in a C3 plant, C4 plant, or alga wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane and a cyclic electron transfer modulator protein.
Yet another embodiment provides for a method of producing biomass or other products from a C3 plant, C4 plant, or an alga, wherein said products are selected from among starches, oils, fatty acids, triacylglycerols, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane and a cyclic electron transfer modulator protein. This method further comprises growing said plant or alga and harvesting said biomass or recovering said product from said plant or alga. Another aspect of the present invention provides for biomass or other product produced from a plant or alga selected from among starches, oils, fatty acids, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids, made by any one of the methods of making a transgenic plant or alga described herein.
Another embodiment provides a method of elevating cyclic electron transfer (CET) activity in a C3 plant, C4 plant, or alga wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane, a bicarbonate anion transporter, a carbonic anhydrase, a proteorhodopsin protein targeted to thylakoid membranes; and a .beta.-carotene monooxygenase protein.
Another embodiment provides a method of enhancing carbon fixation in a C3 plant, C4 plant, or alga wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane, a bicarbonate anion transporter, a carbonic anhydrase, a proteorhodopsin protein targeted to thylakoid membranes; and a .beta.-carotene monooxygenase protein.
Another embodiment provides for a method of producing biomass or other products from a C3 plant, C4 plant, or an alga, wherein said products are selected from among starches, oils, fatty acids, triacylglycerols, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding an ATP dependent bicarbonate anion transporter localized to the plasma membrane, a bicarbonate anion transporter, a carbonic anhydrase, a proteorhodopsin protein targeted to thylakoid membranes; and a .beta.-carotene monooxygenase protein. The method further comprises growing said plant or alga and harvesting said biomass or recovering said product from said plant or alga.
Another embodiment provides for use of a construct comprising one or more nucleic acids encoding a) a PGR5 protein, and a HLA3 protein; b) a PGR5 protein, a HLA3 protein and a PGRL1 protein; c) a PGR5 protein, a HLA3 protein, and a LCIA protein; d) a PGR5 protein, a HLA3 protein, a LCIA protein and a BCA or HCA2 protein; e) a PGR5 protein, a HLA3 protein, a PGRL1 protein and a LCIA protein; f) a PGR5 protein, a HLA3 protein, a PGRL1 protein, a LCIA protein, and a BCA or HCA2 protein; g) a PGR5 protein, a HLA3 protein, and a BCA or HCA2 protein; or h) a PGR5 protein, a HLA3 protein, a PGRL1 protein, and a BCA or HCA2 protein for i) making a transgenic plant or alga of a first embodiment; ii) elevating CET activity in a C3 plant, C4 plant, or alga; iii) enhancing carbon fixation in a C3 plant, C4 plant, or alga; or iv) producing biomass or other products from a C3 plant, C4 plant, or an alga, wherein said products are selected from among starches, oils, fatty acids, triacylglycerols, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids.
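Read side by side, options a) through h) above are the PGR5 plus HLA3 pair combined with every subset of three optional components: PGRL1, LCIA, and a carbonic anhydrase (BCA or HCA2). The short sketch below enumerates those subsets; the variable and function names are illustrative assumptions, and the output order follows subset size rather than the a) to h) lettering.

```python
from itertools import combinations

# Components present in every listed option.
BASE = ("PGR5", "HLA3")
# Components that appear in some options but not others.
OPTIONAL = ("PGRL1", "LCIA", "BCA or HCA2")


def construct_combinations() -> list[tuple[str, ...]]:
    """Return BASE combined with every subset of OPTIONAL (2**3 = 8 tuples)."""
    combos = []
    for size in range(len(OPTIONAL) + 1):
        for extra in combinations(OPTIONAL, size):
            combos.append(BASE + extra)
    return combos


for combo in construct_combinations():
    print(", ".join(combo))
```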
Another embodiment provides for use of a construct comprising one or more nucleic acids encoding a LCIA protein and a BCA or HCA2 protein for i) making a transgenic plant or alga of a second embodiment; ii) elevating CET activity in a C3 plant, C4 plant, or alga; iii) enhancing carbon fixation in a C3 plant, C4 plant, or alga; or iv) producing biomass or other products from a C3 plant, C4 plant, or an alga, wherein said products are selected from among starches, oils, fatty acids, triacylglycerols, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids.
One aspect of the present invention provides for a transgenic plant or alga, comprising within its genome, and expressing or overexpressing, a combination of heterologous nucleotide sequences encoding:
1. i) a PGRL1 protein, a PGR5 protein, and a HLA3 protein; or ii) a PGRL1 protein, a PGR5 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or iii) a Fd1 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or iv) a leaf FNR1 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or v) a proteorhodopsin protein targeted to thylakoid membranes, a HLA3 protein, a LCIA protein, a BCA or HCA2 protein, and a .beta.-carotene monooxygenase.
2. The transgenic plant or alga of 1, wherein said proteorhodopsin comprises a chloroplast transit peptide selected from among a psbX stop-transfer trans-membrane domain fused to its C-terminus, a DNAJ transit peptide, a CAB transit peptide, a PGR5 transit peptide, and a psaD transit peptide.
3. The transgenic plant or alga of 1 or 2, wherein said .beta.-carotene monooxygenase is expressed under the control of a promoter selected from among an ethanol inducible gene promoter and a green tissue/leaf-specific promoter selected from among CAB and rbcS.
4. The transgenic plant or alga of any one of 1-3, wherein said proteorhodopsin comprises an amino acid substitution selected from among L219E/T206S, M79T, and M79Y, and combinations thereof.
5. The transgenic plant of any one of 1-4, which is a C3 plant or a C4 plant.
6. The transgenic plant of any one of 1-5, which is a transgenic oilseed plant or a transgenic food crop plant.
7. The transgenic oilseed plant of 6, which is selected from among plants of the genera Brassica (e.g., rapeseed/canola (Brassica napus; Brassica carinata; Brassica nigra; Brassica oleracea)), Camelina, Miscanthus, and Jatropha; Jojoba (Simmondsia chinensis), coconut; cotton; peanut; rice; safflower; sesame; soybean; mustard other than Arabidopsis; wheat; flax (linseed); sunflower; olive; corn; palm; palm kernel; sugarcane; castor bean; switchgrass; Borago officinalis; Echium plantagineum; Cuphea hookeriana; Cuphea pulcherrima; Cuphea lanceolata; Ricinus communis; Coriandrum sativum; Crepis alpina; Vernonia galamensis; Momordica charantia; and Crambe abyssinica.
8. The transgenic alga of any one of 1-5, which is selected from among Chlorella sp., Nannochloropsis sp., and Chlamydomonas sp.
9. The transgenic plant or alga of any one of 1-8, wherein said heterologous nucleotide sequences are codon-optimized for expression in said transgenic plant or alga.
10. The transgenic plant or alga of any one of 1-9, which exhibits enhanced CO.sub.2 fixation compared to an otherwise identical control plant grown under the same conditions.
11. The transgenic plant or alga of 10, wherein CO.sub.2 fixation is enhanced in the range of from about 10% to about 50% compared to that of an otherwise identical control plant grown under the same conditions.
12. A part of said transgenic plant or alga of any one of 1-11.
13. The part of said transgenic plant of 12, which is selected from among a protoplast, a cell, a tissue, an organ, a cutting, an explant, a reproductive tissue, a vegetative tissue, biomass, an inflorescence, a flower, a sepal, a petal, a pistil, a stigma, a style, an ovary, an ovule, an embryo, a receptacle, a seed, a fruit, a stamen, a filament, an anther, a male or female gametophyte, a pollen grain, a meristem, a terminal bud, an axillary bud, a leaf, a stem, a root, a tuberous root, a rhizome, a tuber, a stolon, a corm, a bulb, an offset, a cell of said plant in culture, a tissue of said plant in culture, an organ of said plant in culture, a callus, propagation materials, germplasm, cuttings, divisions, and propagations.
14. Progeny or derivatives of said transgenic plant or alga of any one of 1-11.
15. The progeny or derivatives of 14, which is selected from among clones, hybrids, samples, seeds, and harvested material thereof.
16. The progeny of 14 or 15, which is produced sexually.
17. The progeny of 14 or 15, which is produced asexually.
Another aspect of the present invention provides for a method selected from among:
18. i) making a transgenic plant or alga of any one of 1-11;
ii) elevating CET activity in a C3 plant, C4 plant, or alga;
iii) enhancing carbon fixation in a C3 plant, C4 plant, or alga; and
iv) producing biomass or other products from a C3 plant, C4 plant, or alga, wherein said products are selected from among starches, oils, fatty acids, triacylglycerols, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids,
wherein said method comprises expressing, or overexpressing, in a C3 plant, a C4 plant, or an alga, a combination of heterologous nucleotide sequences encoding: a) a PGRL1 protein, a PGR5 protein, and a HLA3 protein; or b) a PGRL1 protein, a PGR5 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or c) a Fd1 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or d) a leaf FNR1 protein, a HLA3 protein, a LCIA protein, and a BCA or HCA2 protein; or e) a proteorhodopsin protein targeted to thylakoid membranes, a HLA3 protein, a LCIA protein, a BCA or HCA2 protein, and a β-carotene monooxygenase.
19. The method of 18, wherein step iv) further comprises growing said plant or alga and harvesting said biomass or recovering said product from said plant or alga.
20. The method of 18 or 19, wherein said proteorhodopsin comprises a chloroplast transit peptide selected from among a psbX stop-transfer trans-membrane domain fused to its C-terminus, a DNAJ transit peptide, a CAB transit peptide, a PGR5 transit peptide, and a psaD transit peptide.
21. The method of any one of 18-20, wherein said β-carotene monooxygenase is expressed under the control of a promoter selected from among an ethanol inducible gene promoter and a green tissue/leaf-specific promoter selected from among CAB and rbcS.
22. The method of any one of 18-21, wherein said proteorhodopsin comprises an amino acid substitution selected from among L219E/T206S, M79T, and M79Y, and combinations thereof.
23. The method of any one of 18-22, wherein said transgenic plant is a C3 plant, a C4 plant, or an alga.
24. The method of any one of 18-23, wherein said transgenic plant is a transgenic oilseed plant or a transgenic food crop plant.
25. The method of 24, wherein said transgenic oilseed plant is selected from among plants of the genera Brassica (e.g., rapeseed/canola (Brassica napus; Brassica carinata; Brassica nigra; Brassica oleracea)), Camelina, Miscanthus, and Jatropha; Jojoba (Simmondsia chinensis); coconut; cotton; peanut; rice; safflower; sesame; soybean; mustard other than Arabidopsis; wheat; flax (linseed); sunflower; olive; corn; palm; palm kernel; sugarcane; castor bean; switchgrass; Borago officinalis; Echium plantagineum; Cuphea hookeriana; Cuphea pulcherrima; Cuphea lanceolata; Ricinus communis; Coriandrum sativum; Crepis alpina; Vernonia galamensis; Momordica charantia; and Crambe abyssinica.
26. The method of any one of 18-23, wherein said alga is selected from among Chlorella sp., Nannochloropsis sp., and Chlamydomonas sp.
27. The method of any one of 18-26, wherein said heterologous nucleotide sequences are codon-optimized for expression in said transgenic plant or alga.
28. The method of any one of 18-27, wherein said transgenic plant or alga exhibits enhanced CO₂ fixation compared to an otherwise identical control plant or alga grown under the same conditions.
29. The method of 28, wherein CO₂ fixation is enhanced in the range of from about 10% to about 50% compared to that of an otherwise identical control plant or alga grown under the same conditions.
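By way of illustration only, the enhancement range recited in embodiments 11 and 29 can be checked with a short calculation. The following Python sketch is not part of the disclosed methods; the function names and the example assimilation rates are hypothetical, and statistical significance is not assessed.

```python
def percent_enhancement(control_rate, transgenic_rate):
    """Percent change of the transgenic value relative to the control value."""
    if control_rate <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (transgenic_rate - control_rate) / control_rate


def in_recited_range(pct, low=10.0, high=50.0):
    """True if pct falls within the inclusive range low..high (percent)."""
    return low <= pct <= high


# Hypothetical CO2 assimilation rates in arbitrary but identical units.
enhancement = percent_enhancement(20.0, 26.0)
print(enhancement, in_recited_range(enhancement))
```

The endpoints are treated as inclusive, and no allowance is made here for the flexibility of the word "about".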
Another aspect of the present invention provides for a transgenic plant or alga made by the method of any one of 18-29.
Yet another aspect of the present invention provides for a biomass or other product from a plant or alga, selected from among starches, oils, fatty acids, lipids, cellulose or other carbohydrates, alcohols, sugars, nutraceuticals, pharmaceuticals, fragrance and flavoring compounds, and organic acids, made by the method of any one of 18-29.
In addition to the various embodiments listed above, in the Examples below, and in the claims, this disclosure further variously encompasses the presently disclosed and claimed CCM protein combinations in further combinations with the genes and proteins focusing on engineering pathways that are directly involved in facilitating CO₂ transport into leaf chloroplasts, disclosed and claimed in the inventors' previous application PCT International Publication WO 2012/125737. The present disclosure encompasses any combination of genes disclosed herein with any combination of genes disclosed in WO 2012/125737 and in Tables D1-D9 to improve carbon concentrating systems (CCMs) in plants and algae.
Table D1 represents different classes of α-CAs found in mammals.
Tables D2-D4 present representative species, GenBank accession numbers, and amino acid sequences for various species of suitable CA genes.
Table D5 represents the codon optimized DNA sequence for chloroplast expression in Chlamydomonas reinhardtii. In Table D5, the underlined sequences represent restriction sites, and bases changed to optimize chloroplast expression are listed in lower case. Table D6 provides a breakdown of the number and type of each codon optimized.
Representative species and GenBank accession numbers for various species of bicarbonate transporter are listed below in Tables D8-D9.
Further scope of the applicability of the presently disclosed embodiments will become apparent from the detailed description and drawing(s) provided below. However, it should be understood that the detailed description and specific examples, while indicating preferred embodiments of this disclosure, are given by way of illustration only since various changes and modifications within the spirit and scope of these embodiments will become apparent to those skilled in the art from this detailed description.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
The disclosure can be more fully understood from the following detailed description and the accompanying Sequence Listing, which form a part of this application.
The sequence descriptions summarize the Sequence Listing attached hereto. The Sequence Listing contains standard symbols and format used for nucleotide and amino acid sequence data that comply with the rules set forth in 37 C.F.R. § 1.822.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and other aspects, features, and advantages of the present disclosure will be better understood from the following detailed descriptions taken in conjunction with the accompanying drawing(s), all of which are given by way of illustration only, and are not limitative of the presently disclosed embodiments, in which:
FIG. 1. Model of the Chlamydomonas CCM showing the localization of inorganic carbon transporters (HLA3, LCIA), carbonic anhydrases (CAH: CAH1, CAH3, and CAH6) [5], and Rubisco. LCIB is an essential protein for CCM in Chlamydomonas. Its exact function is unknown.
FIG. 2. (A-B) Growth phenotypes of WT and HLA3 transgenic (T3) Arabidopsis initially grown on MS media (plus nitrate, NO₃⁻). (B) MS media (plus ammonium (NH₄⁺) and sucrose) or in soil (ammonium only). X indicates plants died. Numbers refer to plant lines.
FIG. 3. (A-B) Growth phenotypes of WT and HCA-II transgenic (T1) Arabidopsis 4 weeks after germination. (B) Growth phenotype of WT Arabidopsis (Col-0, left) and the BCA transgenic (T3) (right).
FIG. 4. Photosynthetic assimilation rate of CO₂ in three transgenic lines (P1, P5, P6) of Arabidopsis expressing BCA (bacterial carbonic anhydrase) measured using a LICOR 6400 gas analyzer. These lines showed ~30% increase in their photosynthetic efficiency when compared to WT Arabidopsis (Col-0).
FIG. 5. (A-C) Growth phenotypes of WT and LCIA transgenic (T1) Arabidopsis plants four weeks after germination. (B) Four-week-old WT (left 4 plants) and independent transgenic Camelina (right 4 plants) expressing LCIA. (C) CO₂-dependent photosynthetic rates of WT and LCIA transgenic Camelina.
FIG. 6. Phenotype of HLA3 transgenics grown on nitrate. Energy charge and reductive potential of WT and HLA3 transgenic Arabidopsis. Adenylate, nucleotide cofactors, and inorganic phosphate levels measured as nmole/gFW for plants grown on nitrate. Values are averages ± SE.
FIG. 7. Photosynthetically active radiation in proteorhodopsin relative to plant-based chlorophyll [49].
FIG. 8. Plasmid pB110-CAB-PGR5-NOS (Example 1).
FIG. 9. Plasmid pB110-HLA3-pgr5-dsred (Example 1).
FIG. 10. Plasmid pBI 121-CAB1-Tp-NgCAf2-dsred (Example 1).
FIG. 11 illustrates light response curves of Camelina BCA lines.
FIG. 12 illustrates expression of LCIA in Camelina vs WT.
DETAILED DESCRIPTION OF SEVERAL EMBODIMENTS
The following detailed description is provided to aid those skilled in the art in practicing the various embodiments of the present disclosure described herein, including all the methods, uses, compositions, etc., described herein. Even so, the following detailed description should not be construed to unduly limit the present disclosure, as modifications and variations in the embodiments herein discussed may be made by those of ordinary skill in the art without departing from the spirit or scope of the present discoveries.
The present disclosure is explained in greater detail below. This disclosure is not intended to be a detailed catalog of all the different ways in which embodiments of this disclosure can be implemented, or all the features that can be added to the instant embodiments. For example, features illustrated with respect to one embodiment may be incorporated into other embodiments, and features illustrated with respect to a particular embodiment may be deleted from that embodiment. In addition, numerous variations and additions to the various embodiments suggested herein will be apparent to those skilled in the art in light of the instant disclosure, which variations and additions do not depart from the scope of the instant disclosure. Hence, the following specification is intended to illustrate some particular embodiments of the disclosure, and not to exhaustively specify all permutations, combinations, and variations thereof.
Any feature, or combination of features, described herein is (are) included within the scope of the present disclosure, provided that the features included in any such combination are not mutually inconsistent as will be apparent from the context, this specification, and the knowledge of one of ordinary skill in the art. Additional advantages and aspects of the present disclosure are apparent in the following detailed description and claims.
The contents of all publications, patent applications, patents, and other references mentioned herein are incorporated by reference herein in their entirety. In case of conflict, the present specification, including explanations of terms, will control.
Definitions
The following definitions are provided to aid the reader in understanding the various aspects of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the disclosure pertains.
As used herein and in the appended claims, the singular forms "a", "an", and "the" include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a plant" includes a plurality of such plants, reference to "a cell" includes one or more cells and equivalents thereof known to those skilled in the art, and so forth. Similarly, the word "or" is intended to include "and" unless the context clearly indicates otherwise. Hence "comprising A or B" means including A, or B, or A and B. Furthermore, the use of the term "including", as well as other related forms, such as "includes" and "included", is not limiting.
The term "about" as used herein is a flexible word with a meaning similar to "approximately" or "nearly". The term "about" indicates that exactitude is not claimed, but rather a contemplated variation. Thus, as used herein, the term "about" means within 1 or 2 standard deviations from the specifically recited value, or ± a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 4%, 3%, 2%, or 1% compared to the specifically recited value.
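Purely as an illustration of the percentage-based reading of "about", the following Python sketch tests whether a measured value falls within a chosen fractional tolerance of a recited value. The function name and the default tolerance of 20% are assumptions made for the example; the standard-deviation reading is not modeled.

```python
def within_about(value, recited, tolerance=0.20):
    """True if value is within +/- tolerance (as a fraction) of the recited value."""
    return abs(value - recited) <= tolerance * abs(recited)


# "about 10" read with the widest recited tolerance (20%), then with 5%.
print(within_about(11.5, 10))
print(within_about(11.5, 10, tolerance=0.05))
```

A narrower tolerance can be passed explicitly where a stricter reading of "about" is intended.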
The term "comprising" as used in a claim herein is open-ended, and means that the claim must have all the features specifically recited therein, but that there is no bar on additional features that are not recited being present as well. The term "comprising" leaves the claim open for the inclusion of unspecified ingredients even in major amounts. The term "consisting essentially of" in a claim means that the invention necessarily includes the listed ingredients, and is open to unlisted ingredients that do not materially affect the basic and novel properties of the invention. A "consisting essentially of" claim occupies a middle ground between closed claims that are written in a closed "consisting of" format and fully open claims that are drafted in a "comprising" format. These terms can be used interchangeably herein if, and when, this may become necessary. Furthermore, the use of the term "including", as well as other related forms, such as "includes" and "included", is not limiting.
"BCA" refers to bacterial carbonic anhydrase.
"CCMs" and the like refer to carbon concentrating systems.
"CET" refers to cyclic electron transfer.
"LET" refers to linear electron transfer.
"WT" refers to wild-type.
"Cyclic electron transfer modulator protein" refers to any protein, natural or synthetic, that improves the separation of charge across the thylakoid membrane resulting in improved photophosphorylation with the production of chemical energy. Examples of such modulators are the PGR5 and PGRL1 reductases; however, improved proteins in the electron transport chain such as cytochromes, ATPases, ferredoxin-NADP reductase, NAD(P)H-plastoquinone reductase, and the like are also CET modulator proteins.
Unless otherwise stated, nucleic acid sequences in the text of this specification are given, when read from left to right, in the 5' to 3' direction. Nucleic acid sequences may be provided as DNA or as RNA, as specified; disclosure of one necessarily defines the other, as is known to one of ordinary skill in the art and is understood as included in embodiments where it would be appropriate. Nucleotides may be referred to by their commonly accepted single-letter codes. Unless otherwise indicated, amino acid sequences are written left to right in amino to carboxyl orientation, respectively. Amino acids may be referred to herein by either their commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. It is further to be understood that all base sizes or amino acid sizes, and all molecular weight or molecular mass values, given for nucleic acids or polypeptides are approximate, and are provided for description purposes and are not to be unduly limiting.
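The convention that disclosure of a DNA sequence necessarily defines the corresponding RNA sequence, and vice versa, can be sketched as follows. This Python example is illustrative only; the helper names are hypothetical, sequences are plain strings read 5' to 3', and only the four unambiguous bases are handled.

```python
# Hypothetical helpers; all sequences are written 5' -> 3'.
_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def dna_to_rna(dna):
    """RNA with the same 5'->3' base order as the DNA (T replaced by U)."""
    return dna.upper().replace("T", "U")


def rna_to_dna(rna):
    """DNA with the same 5'->3' base order as the RNA (U replaced by T)."""
    return rna.upper().replace("U", "T")


def reverse_complement(dna):
    """Opposite DNA strand, also written 5'->3'."""
    return "".join(_DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


print(dna_to_rna("ATGC"), reverse_complement("ATGC"))
```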
Regarding disclosed ranges, the endpoints of all ranges directed to the same component or property are inclusive and independently combinable (e.g., ranges of "up to about 25 wt. %, or, more specifically, about 5 wt. % to about 20 wt. %," are inclusive of the endpoints and all intermediate values of the ranges of "about 5 wt. % to about 25 wt. %," etc.). Numeric ranges recited within the specification are inclusive of the numbers defining the range and include each integer within the defined range.
As used herein, "altering level of production" or "altering level of expression" means changing, either by increasing or decreasing, the level of production or expression of a nucleic acid sequence or an amino acid sequence (for example a polypeptide, an siRNA, a miRNA, an mRNA, a gene), as compared to a control level of production or expression.
"Conservative amino acid substitutions": It is well known that certain amino acids can be substituted for other amino acids in a protein structure without appreciable loss of biochemical or biological activity. Since it is the interactive capacity and nature of a protein that defines that protein's biological functional activity, certain amino acid sequence substitutions can be made in a protein sequence, and, of course, its underlying DNA coding sequence, and nevertheless obtain a protein with like properties. Thus, various changes can be made in the amino acid sequences disclosed herein, or in the corresponding DNA sequences that encode these amino acid sequences, without appreciable loss of their biological utility or activity.
Proteins and peptides biologically functionally equivalent to the proteins and peptides disclosed herein include amino acid sequences containing conservative amino acid changes in the fundamental amino acid sequence. In such amino acid sequences, one or more amino acids in the fundamental sequence can be substituted, for example, with another amino acid(s), the charge and polarity of which is similar to that of the native amino acid, i.e., a conservative amino acid substitution, resulting in a silent change.
It should be noted that there are a number of different classification systems in the art that have been developed to describe the interchangeability of amino acids for one another within peptides, polypeptides, and proteins. The following discussion is merely illustrative of some of these systems, and the present disclosure encompasses any of the "conservative" amino acid changes that would be apparent to one of ordinary skill in the art of peptide, polypeptide, and protein chemistry from any of these different systems.
As disclosed in U.S. Pat. No. 5,599,686, certain amino acids in a biologically active peptide, polypeptide, or protein can be replaced by other homologous, isosteric, and/or isoelectronic amino acids, wherein the biological activity of the original molecule is conserved in the modified peptide, polypeptide, or protein. The following list of amino acid replacements is meant to be illustrative and is not limiting:
TABLE-US-00001
Original Amino Acid | Replacement Amino Acid(s) |
---|---|
Ala | Gly |
Arg | Lys, ornithine |
Asn | Gln |
Asp | Glu |
Glu | Asp |
Gln | Asn |
Gly | Ala |
Ile | Val, Leu, Met, Nle (norleucine) |
Leu | Ile, Val, Met, Nle |
Lys | Arg |
Met | Leu, Ile, Nle, Val |
Phe | Tyr, Trp |
Ser | Thr |
Thr | Ser |
Trp | Phe, Tyr |
Tyr | Phe, Trp |
Val | Leu, Ile, Met, Nle |
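For illustration, the replacement list above can be held as a simple lookup. The following Python sketch transcribes the table using three-letter symbols; the function name is hypothetical, and the lookup is directional because the listed replacements are not all reciprocal (for example, ornithine is listed for Arg but not for Lys).

```python
# Replacement list transcribed from the table above (original -> replacements).
REPLACEMENTS = {
    "Ala": {"Gly"},
    "Arg": {"Lys", "ornithine"},
    "Asn": {"Gln"},
    "Asp": {"Glu"},
    "Glu": {"Asp"},
    "Gln": {"Asn"},
    "Gly": {"Ala"},
    "Ile": {"Val", "Leu", "Met", "Nle"},
    "Leu": {"Ile", "Val", "Met", "Nle"},
    "Lys": {"Arg"},
    "Met": {"Leu", "Ile", "Nle", "Val"},
    "Phe": {"Tyr", "Trp"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"Phe", "Trp"},
    "Val": {"Leu", "Ile", "Met", "Nle"},
}


def is_listed_replacement(original, replacement):
    """True if `replacement` is listed for `original` in the table."""
    return replacement in REPLACEMENTS.get(original, set())


print(is_listed_replacement("Ile", "Val"), is_listed_replacement("Ala", "Val"))
```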
In another system, substitutes for an amino acid within a fundamental sequence can be selected from other members of the class to which the naturally occurring amino acid belongs. Amino acids can be divided into the following four groups: (1) acidic amino acids; (2) basic amino acids; (3) neutral polar amino acids; and (4) neutral non-polar amino acids. Representative amino acids within these various groups include, but are not limited to: (1) acidic (negatively charged) amino acids such as aspartic acid and glutamic acid; (2) basic (positively charged) amino acids such as arginine, histidine, and lysine; (3) neutral polar amino acids such as glycine, serine, threonine, cysteine, cystine, tyrosine, asparagine, and glutamine; (4) neutral nonpolar (hydrophobic) amino acids such as alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine.
Conservative amino acid changes within a fundamental peptide, polypeptide, or protein sequence can be made by substituting one amino acid within one of these groups with another amino acid within the same group.
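The four-group system just described lends itself to a simple membership test. The Python sketch below is illustrative only: it uses one-letter codes, the names are hypothetical, and cystine (a disulfide-linked dimer with no one-letter code) is omitted as a simplifying assumption.

```python
# Four groups as described above, in one-letter codes (cystine omitted).
GROUPS = {
    "acidic": set("DE"),
    "basic": set("RHK"),
    "neutral polar": set("GSTCYNQ"),
    "neutral nonpolar": set("ALIVPFWM"),
}


def group_of(residue):
    """Return the name of the group containing the residue."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise KeyError(f"unclassified residue: {residue!r}")


def is_conservative(original, replacement):
    """True if both residues belong to the same group."""
    return group_of(original) == group_of(replacement)


print(is_conservative("D", "E"), is_conservative("K", "E"))
```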
Some of the other systems for classifying conservative amino acid interchangeability in peptides, polypeptides, and proteins applicable to the sequences of the present disclosure include, for example, the following:
Functionally defining common properties between individual amino acids by analyzing the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz, G. E. and R. H. Schirmer (1979) Principles of Protein Structure (Springer Advanced Texts in Chemistry), Springer-Verlag). According to such analyses, groups of amino acids can be defined where amino acids within a group exchange preferentially with each other, and therefore resemble each other most in their impact on overall protein structure;
Making amino acid changes based on the hydropathic index of amino acids as described by Kyte and Doolittle (1982) J. Mol. Biol. 157(1):105-32. Certain amino acids can be substituted by other amino acids having a similar hydropathic index or score and still result in a protein with similar biological activity, i.e., still obtain a biological functionally equivalent protein. In making such changes, the substitution of amino acids whose hydropathic indices are within ±2 is preferred, those that are within ±1 are particularly preferred, and those within ±0.5 are even more particularly preferred;
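As a minimal sketch of the hydropathic-index criterion, the following Python example uses the index values published by Kyte and Doolittle (1982) and reports which preference tier a proposed substitution falls into. The function name is hypothetical, and the thresholds are treated as absolute differences of 2, 1, and 0.5 index units, which is an interpretive assumption.

```python
# Kyte-Doolittle hydropathic index values, one-letter codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_preference(original, replacement):
    """Return the preference tier for a substitution, or None if outside 2 units."""
    diff = abs(KD[original] - KD[replacement])
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return None


print(hydropathy_preference("I", "V"), hydropathy_preference("I", "R"))
```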
Substitution of like amino acids on the basis of hydrophilicity. U.S. Pat. No. 4,554,101 states that the greatest local average hydrophilicity of a protein, as governed by the hydrophilicity of its adjacent amino acids, correlates with a biological property of the protein. As detailed in this patent, the following hydrophilicity values have been assigned to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0 ± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine (+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5); cysteine (-1.0); methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
Betts and Russell ((2003), "Amino Acid Properties and Consequences of Substitutions", Bioinformatics for Geneticists, Michael R. Barnes and Ian C. Gray, Eds., John Wiley & Sons, Ltd, Chapter 14, pp. 289-316) review the nature of mutations and the properties of amino acids in a variety of different protein contexts with the purpose of aiding in anticipating and interpreting the effect that a particular amino acid change will have on protein structure and function. The authors point out that features of proteins relevant to considering amino acid mutations include cellular environments, three-dimensional structure, and evolution, as well as the classifications of amino acids based on evolutionary, chemical, and structural principles, and the role for amino acids of different classes in protein structure and function in different contexts. The authors note that classification of amino acids into categories such as those shown in FIG. 14.3 of their review, which involves common physico-chemical properties, size, affinity for water (polar and non-polar; negative or positive charge), aromaticity and aliphaticity, hydrogen-bonding ability, propensity for sharply turning regions, etc., makes it clear that reliance on simple classifications can be dangerous, and suggests that alternative amino acids could be engineered into a protein at each position. Criteria for interpreting how a particular mutation might affect protein structure and function are summarized in section 14.7 of this review, and include first inquiring about the protein, and then about the particular amino acid substitution contemplated.
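Returning to the hydrophilicity scale attributed above to U.S. Pat. No. 4,554,101, the notion of a "greatest local average hydrophilicity" can be sketched as a sliding-window average over a sequence. The Python example below is illustrative only: the function name, the window length of six residues, and the use of the central values for aspartate, glutamate, and proline (ignoring their stated tolerances) are all assumptions.

```python
# Hydrophilicity values as listed in the text above (central values only).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3,
    "N": 0.2, "Q": 0.2, "G": 0.0, "T": -0.4, "P": -0.5,
    "A": -0.5, "H": -0.5, "C": -1.0, "M": -1.3, "V": -1.5,
    "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5, "W": -3.4,
}


def greatest_local_average(sequence, window=6):
    """Return (start index, average) of the most hydrophilic window."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_avg = 0, float("-inf")
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        avg = sum(HYDROPHILICITY[res] for res in segment) / window
        if avg > best_avg:  # keep the first window on ties
            best_start, best_avg = start, avg
    return best_start, best_avg


# Made-up peptide with a charged stretch flanked by leucines.
print(greatest_local_average("LLLLKKDDEELLLL"))
```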
Biologically/enzymatically functional equivalents of the proteins and peptides disclosed herein can have 10 or fewer conservative amino acid changes, more preferably seven or fewer conservative amino acid changes, and most preferably five or fewer conservative amino acid changes, i.e., 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 conservative amino acid changes. The encoding nucleotide sequence (e.g., gene, plasmid DNA, cDNA, codon-optimized DNA, or other synthetic DNA) will thus have corresponding base substitutions, permitting it to code for the biologically functionally equivalent form of protein or peptide. Due to the degeneracy of the genetic code, i.e., the existence of more than one codon for most of the amino acids naturally occurring in proteins, other DNA (and RNA) sequences that contain essentially the same genetic information as these nucleic acids, and which encode the same amino acid sequence as that encoded by these nucleic acids, can be used in the methods disclosed herein. This principle applies as well to any of the other nucleotide sequences disclosed herein.
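The degeneracy of the genetic code, and the counting of amino acid changes between a reference and a variant, can be illustrated with a short Python sketch. The standard genetic code is assumed; the function names and the example sequences are hypothetical, and the counter tallies all differing positions without judging whether each change is conservative.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence (5'->3') codon by codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_changes(reference, variant):
    """Number of positions at which two equal-length proteins differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return sum(1 for r, v in zip(reference, variant) if r != v)


# Two different DNA sequences that encode the same peptide.
print(translate("ATGGCTAAA"), translate("ATGGCCAAG"))
```

A variant could then be screened against a limit such as 10 or fewer changes by comparing the result of `count_changes` with that limit.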
"Control" or "control level" means the level of a molecule, such asa polypeptide or nucleic acid, normally found in nature under acertain condition and/or in a specific genetic background. Incertain embodiments, a control level of a molecule can be measuredin a cell or specimen that has not been subjected, either directlyor indirectly, to a treatment. A control level is also referred toas a wildtype or a basal level. These terms are understood by thoseof ordinary skill in the art. A control plant, i.e. a plant thatdoes not contain a recombinant DNA that confers (for instance) anenhanced trait in a transgenic plant, is used as a baseline forcomparison to identify an enhanced trait in the transgenic plant. Asuitable control plant may be a non-transgenic plant of theparental line used to generate a transgenic plant. A control plantmay in some cases be a transgenic plant line that comprises anempty vector or marker gene, but does not contain the recombinantDNA, or does not contain all of the recombinant DNAs, in the testplant.
The terms "enhance", "enhanced", "increase", or "increased" refer to a statistically significant increase. For the avoidance of doubt, these terms generally refer to about a 5% increase in a given parameter or value, about a 10% increase, about a 15% increase, about a 20% increase, about a 25% increase, about a 30% increase, about a 35% increase, about a 40% increase, about a 45% increase, about a 50% increase, about a 55% increase, about a 60% increase, about a 65% increase, about a 70% increase, about a 75% increase, about an 80% increase, about an 85% increase, about a 90% increase, about a 95% increase, about a 100% increase, or more over the control value. These terms also encompass ranges consisting of any lower indicated value to any higher indicated value, for example "from about 5% to about 50%", etc.
"Expression" or "expressing" refers to production of a functional product, such as, the generation of an RNA transcript from an introduced construct, an endogenous DNA sequence, or a stably incorporated heterologous DNA sequence. A nucleotide encoding sequence may comprise intervening sequence (e.g., introns) or may lack such intervening non-translated sequences (e.g., as in cDNA). Expressed genes include those that are transcribed into mRNA and then translated into protein and those that are transcribed into RNA but not translated (for example, siRNA, transfer RNA, and ribosomal RNA). The term may also refer to a polypeptide produced from an mRNA generated from any of the above DNA precursors. Thus, expression of a nucleic acid fragment, such as a gene or a promoter region of a gene, may refer to transcription of the nucleic acid fragment (e.g., transcription resulting in mRNA or other functional RNA) and/or translation of RNA into a precursor or mature protein (polypeptide), or both.
An "expression cassette" refers to a nucleic acid construct, which when introduced into a host cell, results in transcription and/or translation of a RNA or polypeptide, respectively.
The term "genome" as it applies to a plant cell encompasses not only chromosomal DNA found within the nucleus, but also organelle DNA found within subcellular components (e.g., mitochondrial, plastid) of the cell. As used herein, the term "genome" refers to the nuclear genome unless indicated otherwise. However, expression in a plastid genome, e.g., a chloroplast genome, or targeting to a plastid such as a chloroplast via the use of a plastid targeting sequence, is also encompassed by the present disclosure.
The term "heterologous" refers to a nucleic acid fragment or protein that is foreign to its surroundings. In the context of a nucleic acid fragment, this is typically accomplished by introducing such fragment, derived from one source, into a different host. Heterologous nucleic acid fragments, such as coding sequences that have been inserted into a host organism, are not normally found in the genetic complement of the host organism. As used herein, the term "heterologous" also refers to a nucleic acid fragment derived from the same organism, but which is located in a different, e.g., non-native, location within the genome of this organism. Thus, the organism can have more than the usual number of copy(ies) of such fragment located in its (their) normal position within the genome and in addition, in the case of plant cells, within different genomes within a cell, for example in the nuclear genome and within a plastid or mitochondrial genome as well. A nucleic acid fragment that is heterologous with respect to an organism into which it has been inserted or transferred is sometimes referred to as a "transgene."
A "heterologous" PGRL1 protein or CAB transit peptide protein-encoding nucleotide sequence, etc., can be one or more additional copies of an endogenous PGRL1 protein or CAB transit peptide protein-encoding nucleotide sequence, or a nucleotide sequence from another plant or other source. PGRL1 is a putative ferredoxin-plastoquinone reductase involved in photosynthetic cyclic electron flow. Furthermore, these can be genomic or non-genomic nucleotide sequences. Non-genomic nucleotide sequences encoding such proteins and peptides include, by way of non-limiting examples, mRNA; synthetically produced DNA including, for example, cDNA and codon-optimized sequences for efficient expression in different transgenic plants or algae reflecting the pattern of codon usage in such plants; nucleotide sequences encoding the same proteins or peptides, but which are degenerate in accordance with the degeneracy of the genetic code; which contain conservative amino acid substitutions that do not adversely affect their activity, etc., as known by those of ordinary skill in the art.
The term "homology" describes a mathematically based comparison of sequence similarities which is used to identify genes or proteins with similar functions or motifs. The nucleic acid and protein sequences of the present invention can be used as a "query sequence" to perform a search against public databases to, for example, identify other family members, related sequences, or homologs. The term "homologous" refers to the relationship between two nucleic acid sequences and/or proteins that possess a "common evolutionary origin", including nucleic acids and/or proteins from superfamilies (e.g., the immunoglobulin superfamily) in the same species of animal, as well as homologous nucleic acids and/or proteins from different species of animal (for example, myosin light chain polypeptide, etc.; see Reeck et al., (1987) Cell, 50:667). Such proteins (and their encoding nucleic acids) may have sequence homology, as reflected by sequence similarity, whether in terms of percent identity or by the presence of specific residues or motifs and conserved positions. The methods disclosed herein contemplate the use of the presently disclosed nucleic and protein sequences, as well as sequences having sequence identity and/or similarity, and similar function.
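Percent identity, mentioned above as one measure of sequence similarity, can be sketched for two sequences that have already been aligned to equal length. The Python example below is a simplified illustration rather than a database search tool; the function name is hypothetical, "-" is assumed to mark an alignment gap, and gapped columns are counted as mismatches.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Made-up aligned fragments; the second pair contains one gap column.
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("MKV-L", "MKVAL"))
```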
"Host cell" means a cell which contains a vector and supports thereplication and/or expression of the vector. Host cells may beprokaryotic cells such as E. coli, or eukaryotic cells such asyeast, insect, amphibian, or mammalian cells. Alternatively, thehost cells are monocotyledonous or dicotyledonous plant cells.
The term "introduced" means providing a nucleic acid (e.g., an expression construct) or protein into a cell. "Introduced" includes reference to the incorporation of a nucleic acid into a eukaryotic or prokaryotic cell where the nucleic acid may be incorporated into the genome of the cell, and includes reference to the transient provision of a nucleic acid or protein to the cell. "Introduced" includes reference to stable or transient transformation methods, as well as sexually crossing. Thus, "introduced" in the context of inserting a nucleic acid fragment (e.g., a recombinant DNA construct/expression construct) into a cell, can mean "transfection" or "transformation" or "transduction", and includes reference to the incorporation of a nucleic acid fragment into a eukaryotic or prokaryotic cell where the nucleic acid fragment may be incorporated into the genome of the cell (e.g., chromosome, plasmid, plastid, or mitochondrial DNA), converted into an autonomous replicon, or transiently expressed (e.g., transfected mRNA).
The term "isolated" refers to a material such as a nucleic acid molecule, polypeptide, or small molecule, that has been separated from the environment from which it was obtained. It can also mean altered from the natural state. For example, a polynucleotide or a polypeptide naturally present in a living animal is not "isolated" but the same polynucleotide or polypeptide separated from the coexisting materials of its natural state is "isolated", as the term is employed herein. Thus, a polypeptide or polynucleotide produced and/or contained within a recombinant host cell is considered isolated. Also intended as "isolated polypeptides" or "isolated nucleic acid molecules", etc., are polypeptides or nucleic acid molecules that have been purified, partially or substantially, from a recombinant host cell or from a native source.
As used herein, "nucleic acid" or "nucleotide sequence" means a polynucleotide (or oligonucleotide), including single or double-stranded polymers of deoxyribonucleotide or ribonucleotide bases, and unless otherwise indicated, encompasses naturally occurring and synthetic nucleotide analogues having the essential nature of natural nucleotides in that they hybridize to complementary single-stranded nucleic acids in a manner similar to naturally occurring nucleotides. Nucleic acids may also include fragments and modified nucleotide sequences. Nucleic acids disclosed herein can either be naturally occurring, for example genomic nucleic acids, or isolated, purified, non-genomic nucleic acids, including synthetically produced nucleic acid sequences such as those made by solid phase chemical oligonucleotide synthesis, enzymatic synthesis, or by recombinant methods, including for example, cDNA, codon-optimized sequences for efficient expression in different transgenic plants reflecting the pattern of codon usage in such plants, nucleotide sequences that differ from the nucleotide sequences disclosed herein due to the degeneracy of the genetic code but that still encode the protein(s) of interest disclosed herein, nucleotide sequences encoding the presently disclosed protein(s) comprising conservative (or non-conservative) amino acid substitutions that do not adversely affect their normal activity, PCR-amplified nucleotide sequences, and other non-genomic forms of nucleotide sequences familiar to those of ordinary skill in the art.
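By way of illustration only, the degeneracy and codon-optimization concepts in the foregoing definition can be sketched in a few lines of Python. The sketch recodes a short coding sequence with synonymous codons and confirms that the encoded peptide is unchanged. The preferred-codon table (PREFERRED) and the example sequence are hypothetical and are not taken from this disclosure; in practice a codon-usage table for the target plant or alga would be substituted.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Hypothetical preferred codons for an imagined host organism (assumption).
PREFERRED = {"L": "CTG", "K": "AAG", "A": "GCC"}


def translate(dna: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna: str, preferred: dict) -> str:
    """Replace each codon with the preferred synonymous codon, if one is listed."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    recoded = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        replacement = preferred.get(amino_acid, codon)
        # Guard against a table entry that would change the encoded residue.
        if CODON_TABLE[replacement] != amino_acid:
            raise ValueError("preferred codon is not synonymous")
        recoded.append(replacement)
    return "".join(recoded)


original = "ATGTTAAAAGCTTAA"  # encodes M-L-K-A-stop
optimized = recode(original, PREFERRED)
```

Here the two nucleotide sequences differ at three codons yet translate to the same peptide, which is the sense in which degenerate or codon-optimized sequences "still encode the protein(s) of interest".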
The protein-encoding nucleotide sequences, and promoter nucleotide sequences used to drive their expression, disclosed herein can be genomic or non-genomic nucleotide sequences. Non-genomic nucleotide protein-encoding sequences and promoters include, for example, naturally-occurring mRNA, synthetically produced mRNA, naturally-occurring DNA, or synthetically produced DNA. Synthetic nucleotide sequences can be produced by means well known in the art, including by chemical or enzymatic synthesis of oligonucleotides, and include, for example, cDNA, codon-optimized sequences for efficient expression in different transgenic plants and algae reflecting the pattern of codon usage in such organisms, variants containing conservative (or non-conservative) amino acid substitutions that do not adversely affect their normal activity, PCR-amplified nucleotide sequences, etc.
"A PGRL1 protein", "a PGR5 protein", "a HLA3 protein", "a CAB transit peptide", "a PGR5 transit peptide", or any other protein or peptide presently broadly disclosed and utilized in any of the CCM methods and plants and algae disclosed herein refers to a protein or peptide exhibiting enzymatic/functional activity similar or identical to the enzymatic/functional activity of the specifically named protein or peptide. Enzymatic/functional activities of the proteins and peptides disclosed herein are described below. "Similar" enzymatic/functional activity of a protein or peptide can be in the range of from about 75% to about 125% or more of the enzymatic/functional activity of the specifically named protein or peptide when equal amounts of both proteins or peptides are assayed, tested, or expressed as described below under identical conditions, and can therefore be satisfactorily substituted for the specifically named proteins or peptides in the present enhanced CCM methods and transgenic plants and algae.
"Nucleic acid construct" or "construct" refers to an isolated polynucleotide which can be introduced into a host cell. This construct may comprise any combination of deoxyribonucleotides, ribonucleotides, and/or modified nucleotides. This construct may comprise an expression cassette that can be introduced into and expressed in a host cell.
"Operably linked" refers to a functional arrangement of elements. A first nucleic acid sequence is operably linked with a second nucleic acid sequence when the first nucleic acid sequence is placed in a functional relationship with the second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter effects the transcription or expression of the coding sequence. The control elements need not be contiguous with the coding sequence, so long as they function to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between a promoter and the coding sequence and the promoter can still be considered "operably linked" to the coding sequence.
The terms "plant" or "plants" that can be used in the present methods broadly include the classes of higher and lower plants amenable to transformation techniques, including angiosperms (monocotyledonous and dicotyledonous plants), gymnosperms, ferns, and unicellular and multicellular algae. The term "plant" also includes plants which have been modified by breeding, mutagenesis, or genetic engineering (transgenic and non-transgenic plants). It includes plants of a variety of ploidy levels, including aneuploid, polyploid, diploid, haploid, and hemizygous. The plant may be in any form including suspension cultures, embryos, meristematic regions, callus tissue, gametophytes, sporophytes, pollen, microspores, whole plants, shoot vegetative organs/structures (e.g. leaves, stems and tubers), roots, flowers and floral organs/structures, seed (including embryo, endosperm, and seed coat) and fruit, plant tissue (e.g. vascular tissue, ground tissue, and the like) and cells, and progeny of same.
Embodiments of the present disclosure also include parts of plants or algae, which can be selected from among a protoplast, a cell, a tissue, an organ, a cutting, an explant, a reproductive tissue, a vegetative tissue, biomass, an inflorescence, a flower, a sepal, a petal, a pistil, a stigma, a style, an ovary, an ovule, an embryo, a receptacle, a seed, a fruit, a stamen, a filament, an anther, a male or female gametophyte, a pollen grain, a meristem, a terminal bud, an axillary bud, a leaf, a stem, a root, a tuberous root, a rhizome, a tuber, a stolon, a corm, a bulb, an offset, a cell of said plant in culture, a tissue of said plant in culture, an organ of said plant in culture, a callus, propagation materials, germplasm, cuttings, divisions, and propagations.
Other embodiments include progeny or derivatives of transgenic plants and algae disclosed herein selected, for example, from among clones, hybrids, samples, seeds, and harvested material. Progeny can be asexually or sexually produced by methods well known in the art.
Useful C3 and C4 Plants
Plants to which the methods disclosed herein can be advantageously applied include both C3 and C4 plants, including "food crop" and "oilseed" plants, as well as algae.
Food Crop Plants
The term "food crop plant" refers to plants that are either directly edible, or which produce edible products, and that are customarily used to feed humans either directly, or indirectly through animals. Non-limiting examples of such plants include:
1. Cereal crops: wheat, rice, maize (corn), barley, oats, sorghum, rye, and millet;
2. Protein crops: peanuts, chickpeas, lentils, kidney beans, soybeans, lima beans;
3. Roots and tubers: potatoes, sweet potatoes, and cassavas;
4. Oil crops: soybeans, corn, canola, peanuts, palm, coconuts, safflower, cottonseed, sunflower, flax, and olive;
5. Sugar crops: sugar cane and sugar beets;
6. Fruit crops: bananas, oranges, apples, pears, breadfruit, pineapples, and cherries;
7. Vegetable crops and tubers: tomatoes, lettuce, carrots, melons, asparagus, etc.;
8. Nuts: cashews, peanuts, walnuts, pistachio nuts, almonds;
9. Forage and turf grasses;
10. Forage legumes: alfalfa, clover;
11. Drug crops: coffee, cocoa, kola nut, poppy;
12. Spice and flavoring crops: vanilla, sage, thyme, anise, saffron, menthol, peppermint, spearmint, coriander.
In certain embodiments of this disclosure, the food crop plants are soybean, canola, tomato, potato, cassava, wheat, rice, oats, lettuce, broccoli, beets, sugar beets, beans, peas, kale, strawberry, and peanut.
"Oilseed Plants", "Oil Crop Plants", "Biofuels Crops", "Energy Crops"
The terms "oilseed plant" or "oil crop plant", and the like, to which the present methods and compositions can also be applied, refer to plants that produce seeds or fruit with oil content in the range of from about 1 to 2%, e.g., wheat, to about 20%, e.g., soybeans, to over 40%, e.g., sunflowers and rapeseed (canola). These include major and minor oil crops, as well as wild plant species which are used, or are being investigated and/or developed, as sources of biofuels due to their significant oil production and accumulation.
Exemplary oil seed or oil crop plants useful in practicing the methods disclosed herein include, but are not limited to, plants of the genera Brassica (e.g., rapeseed/canola (Brassica napus; Brassica carinata; Brassica nigra; Brassica oleracea)), Camelina, Miscanthus, and Jatropha; jojoba (Simmondsia chinensis); coconut; cotton; peanut; rice; safflower; sesame; soybean; mustard; wheat; flax (linseed); sunflower; olive; corn; palm; palm kernel; sugarcane; castor bean; switchgrass; Borago officinalis; Echium plantagineum; Cuphea hookeriana; Cuphea pulcherrima; Cuphea lanceolata; Ricinus communis; Coriandrum sativum; Crepis alpina; Vernonia galamensis; Momordica charantia; and Crambe abyssinica.
A non-limiting example of a tuber that accumulates significant amounts of reserve lipids is the tuber of Cyperus esculentus (chufa or tigernuts), which has been proposed as an oil crop for biofuel production. In the case of chufa, use of a constitutive or tuber-specific promoter would be useful in the methods disclosed herein.
Useful Algae
Algae useful in practicing various methods of the present disclosure include members of the following divisions: Chlorophyta and Heterokontophyta.
In certain embodiments, useful algae include members of the following classes: Chlorophyceae, Bacillariophyceae, Eustigmatophyceae, and Chrysophyceae. In certain embodiments, useful algae include members of the following genera: Nannochloropsis, Chlorella, Dunaliella, Scenedesmus, Selenastrum, Oscillatoria, Phormidium, Spirulina, Amphora, and Ochromonas. In one embodiment, members of the genus Chlorella are preferred.
Some algal species of particular interest include, without limitation: Bacillariophyceae strains, Chlorophyceae, Cyanophyceae, Xanthophyceae, Chrysophyceae, Chlorella, Crypthecodinium, Schizochytrium, Nannochloropsis, Ulkenia, Dunaliella, Cyclotella, Navicula, Nitzschia, Phaeodactylum, and Thraustochytrid.
Non-limiting examples of algae species that can be used with the methods of the present disclosure include, for example, Achnanthes orientalis, Agmenellum spp., Amphiprora hyalina, Amphora coffeiformis, Amphora coffeiformis var. linea, Amphora coffeiformis var. punctata, Amphora coffeiformis var. taylori, Amphora coffeiformis var. tenuis, Amphora delicatissima, Amphora delicatissima var. capitata, Amphora sp., Anabaena, Ankistrodesmus, Ankistrodesmus falcatus, Boekelovia hooglandii, Borodinella sp., Botryococcus braunii, Botryococcus sudeticus, Bracteococcus minor, Bracteococcus medionucleatus, Carteria, Chaetoceros gracilis, Chaetoceros muelleri, Chaetoceros muelleri var. subsalsum, Chaetoceros sp., Chlamydomonas perigranulata, Chlorella anitrata, Chlorella antarctica, Chlorella aureoviridis, Chlorella candida, Chlorella capsulata, Chlorella desiccata, Chlorella ellipsoidea, Chlorella emersonii, Chlorella fusca, Chlorella fusca var. vacuolata, Chlorella glucotropha, Chlorella infusionum, Chlorella infusionum var. actophila, Chlorella infusionum var. auxenophila, Chlorella kessleri, Chlorella lobophora, Chlorella luteoviridis, Chlorella luteoviridis var. aureoviridis, Chlorella luteoviridis var. lutescens, Chlorella miniata, Chlorella minutissima, Chlorella mutabilis, Chlorella nocturna, Chlorella ovalis, Chlorella parva, Chlorella photophila, Chlorella pringsheimii, Chlorella protothecoides, Chlorella protothecoides var. acidicola, Chlorella regularis, Chlorella regularis var. minima, Chlorella regularis var. umbricata, Chlorella reisiglii, Chlorella saccharophila, Chlorella saccharophila var. ellipsoidea, Chlorella salina, Chlorella simplex, Chlorella sorokiniana, Chlorella sp., Chlorella sphaerica, Chlorella stigmatophora, Chlorella vanniellii, Chlorella vulgaris, Chlorella vulgaris fo. tertia, Chlorella vulgaris var. autotrophica, Chlorella vulgaris var. viridis, Chlorella vulgaris var. vulgaris, Chlorella vulgaris var. vulgaris fo. tertia, Chlorella vulgaris var. vulgaris fo. viridis, Chlorella xanthella, Chlorella zofingiensis, Chlorella trebouxioides, Chlorococcum infusionum, Chlorococcum sp., Chlorogonium, Chroomonas sp., Chrysosphaera sp., Cricosphaera sp., Crypthecodinium cohnii, Cryptomonas sp., Cyclotella cryptica, Cyclotella meneghiniana, Cyclotella sp., Chlamydomonas moewusii, Chlamydomonas reinhardtii, Chlamydomonas sp., Dunaliella sp., Dunaliella bardawil, Dunaliella bioculata, Dunaliella granulata, Dunaliella maritima, Dunaliella minuta, Dunaliella parva, Dunaliella peircei, Dunaliella primolecta, Dunaliella salina, Dunaliella terricola, Dunaliella tertiolecta, Dunaliella viridis, Eremosphaera viridis, Eremosphaera sp., Ellipsoidon sp., Euglena spp., Franceia sp., Fragilaria crotonensis, Fragilaria sp., Gloeocapsa sp., Gloeothamnion sp., Haematococcus pluvialis, Hymenomonas sp., Isochrysis aff. galbana, Isochrysis galbana, Lepocinclis, Micractinium, Monoraphidium minutum, Monoraphidium sp., Nannochloris sp., Nannochloropsis salina, Nannochloropsis sp., Navicula acceptata, Navicula biskanterae, Navicula pseudotenelloides, Navicula pelliculosa, Navicula saprophila, Navicula sp., Nephrochloris sp., Nephroselmis sp., Nitzschia alexandrina, Nitzschia closterium, Nitzschia communis, Nitzschia dissipata, Nitzschia frustulum, Nitzschia hantzschiana, Nitzschia inconspicua, Nitzschia intermedia, Nitzschia microcephala, Nitzschia pusilla, Nitzschia pusilla elliptica, Nitzschia pusilla monoensis, Nitzschia quadrangula, Nitzschia sp., Ochromonas sp., Oocystis parva, Oocystis pusilla, Oocystis sp., Oscillatoria limnetica, Oscillatoria sp., Oscillatoria subbrevis, Parachlorella kessleri, Pascheria acidophila, Pavlova sp., Phaeodactylum tricornutum, Phagus, Phormidium, Platymonas sp., Pleurochrysis carterae, Pleurochrysis dentata, Pleurochrysis sp., Prototheca wickerhamii, Prototheca stagnora, Prototheca portoricensis, Prototheca moriformis, Prototheca zopfii, Pseudochlorella aquatica, Pyramimonas sp., Pyrobotrys, Rhodococcus opacus, Sarcinoid chrysophyte, Scenedesmus armatus, Schizochytrium, Spirogyra, Spirulina platensis, Stichococcus sp., Synechococcus sp., Synechocystis, Tagetes erecta, Tagetes patula, Tetraedron, Tetraselmis sp., Tetraselmis suecica, Thalassiosira weissflogii, and Viridiella fridericiana.
In certain embodiments of this disclosure, the algae are species of Chlorella, Nannochloropsis, and Chlamydomonas listed above.
Exemplary food crop plants include wheat, rice, maize (corn), barley, oats, sorghum, rye, and millet; peanuts, chickpeas, lentils, kidney beans, soybeans, lima beans; potatoes, sweet potatoes, and cassavas; soybeans, corn, canola, peanuts, palm, coconuts, safflower, cottonseed, sunflower, flax, and olive; sugar cane and sugar beets; bananas, oranges, apples, pears, breadfruit, pineapples, and cherries; tomatoes, lettuce, carrots, melons, strawberry, asparagus, broccoli, peas, kale, cashews, peanuts, walnuts, pistachio nuts, almonds; forage and turf grasses; alfalfa, clover; coffee, cocoa, kola nut, poppy; vanilla, sage, thyme, anise, saffron, menthol, peppermint, spearmint, and coriander; and preferably wheat, rice, and canola.
The terms "peptide", "polypeptide", and "protein" are used to refer to polymers of amino acid residues. These terms are specifically intended to cover naturally occurring biomolecules, as well as those that are recombinantly or synthetically produced, for example by solid phase synthesis.
The term "promoter" or "regulatory element" refers to a region or nucleic acid sequence located upstream or downstream from the start of transcription and which is involved in recognition and binding of RNA polymerase and/or other proteins to initiate transcription of RNA. Promoters need not be of plant or algal origin. For example, promoters derived from plant viruses, such as the CaMV35S promoter, or from other organisms, can be used in variations of the embodiments discussed herein. Promoters useful in the present methods include, for example, constitutive, strong, weak, tissue-specific, cell-type specific, seed-specific, inducible, repressible, and developmentally regulated promoters.
A skilled person appreciates that a promoter sequence can be modified to provide for a range of expression levels of an operably linked heterologous nucleic acid molecule. Less than the entire promoter region can be utilized and the ability to drive expression retained. However, it is recognized that expression levels of mRNA can be decreased with deletions of portions of the promoter sequence. Thus, the promoter can be modified to be a weak or strong promoter. A promoter is classified as strong or weak according to its affinity for RNA polymerase (and/or sigma factor); this is related to how closely the promoter sequence resembles the ideal consensus sequence for the polymerase. Generally, by "weak promoter" is intended a promoter that drives expression of a coding sequence at a low level. By "low level" is intended levels of about 1/10,000 transcripts to about 1/100,000 transcripts to about 1/500,000 transcripts. Conversely, a strong promoter drives expression of a coding sequence at a high level, or at about 1/10 transcripts to about 1/100 transcripts to about 1/1,000 transcripts. The promoter of choice is preferably excised from its source by restriction enzymes, but can alternatively be PCR-amplified using primers that carry appropriate terminal restriction sites. It should be understood that the foregoing groups of promoters are non-limiting, and that one skilled in the art could employ other promoters that are not explicitly cited herein.
The term "purified" refers to material such as a nucleic acid, a protein, or a small molecule, which is substantially or essentially free from components which normally accompany or interact with the material as found in its naturally occurring environment, and/or which may optionally comprise material not found within the purified material's natural environment. The latter may occur when the material of interest is expressed or synthesized in a non-native environment. Nucleic acids and proteins that have been isolated include nucleic acids and proteins purified by standard purification methods. The term also encompasses nucleic acids and proteins prepared by recombinant expression in a host cell as well as chemically synthesized nucleic acids.
"Recombinant" refers to a nucleotide sequence, peptide, polypeptide, or protein, expression of which is engineered or manipulated using standard recombinant methodology. This term applies to both the methods and the resulting products. As used herein, the terms "recombinant construct", "expression construct", "chimeric construct", "construct" and "recombinant expression cassette" are used interchangeably.
As used herein, the phrase "sequence identity" or "sequence similarity" is the similarity between two (or more) nucleic acid sequences, or two (or more) amino acid sequences. Sequence identity is frequently measured as the percent of identical nucleotide or amino acid residues at corresponding positions in two or more sequences when the sequences are aligned to maximize sequence matching, i.e., taking into account gaps and insertions.
One of ordinary skill in the art will appreciate that sequence identity ranges are provided for guidance only. It is entirely possible that nucleic acid sequences that do not show a high degree of sequence identity can nevertheless encode amino acid sequences having similar functional activity. It is understood that changes in nucleic acid sequence can be made using the degeneracy of the genetic code to produce multiple nucleic acid molecules that all encode substantially the same protein. Means for making this adjustment are well-known to those of skill in the art. When percentage of sequence identity is used in reference to amino acid sequences it is recognized that residue positions which are not identical often differ by conservative amino acid substitutions, where amino acid residues are substituted for other amino acid residues with similar chemical properties (e.g., charge or hydrophobicity) and therefore do not change the functional properties of the molecule. Where sequences differ in conservative substitutions, the percent sequence identity may be adjusted upwards to correct for the conservative nature of the substitution. Sequences which differ by such conservative substitutions are said to have "sequence similarity" or "similarity". Means for making this adjustment are well-known to those of skill in the art. Typically this involves scoring a conservative substitution as a partial rather than a full mismatch, thereby increasing the percentage sequence identity.
"Percentage of sequence identity" is determined by comparing two optimally aligned sequences over a comparison window, wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. The percentage is calculated by determining the number of positions at which the identical nucleic acid base or amino acid residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison and multiplying the result by 100 to yield the percentage of sequence identity.
Sequence identity (or similarity) can be readily calculated by known methods, including but not limited to those described in: Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48: 1073 (1988). Methods to determine identity are designed to give the largest match between the sequences tested. Moreover, methods to determine identity are codified in publicly available computer programs. Optimal alignment of sequences for comparison can be conducted, for example, by the local homology algorithm of Smith & Waterman, by the homology alignment algorithms, by the search for similarity method, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the GCG Wisconsin Package, available from Accelrys, Inc., San Diego, Calif., United States of America), or by visual inspection. See generally Altschul, S. F. et al., J. Mol. Biol. 215: 403-410 (1990) and Altschul et al., Nucl. Acids Res. 25: 3389-3402 (1997).
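As a minimal sketch of the percentage calculation and of the upward adjustment for conservative substitutions described in the preceding paragraphs, the following Python function operates on two sequences that have already been aligned to equal length. The residue groupings in CONSERVATIVE_GROUPS, the partial_credit parameter, and the choice to count gap columns in the window length are illustrative assumptions, not values or rules taken from this disclosure.

```python
# Hypothetical groups of chemically similar residues (assumption, illustrative only).
CONSERVATIVE_GROUPS = ("ILVM", "FYW", "KRH", "DE", "ST", "NQ")


def is_conservative(a: str, b: str) -> bool:
    """True when both residues fall within the same illustrative group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity(aligned_a: str, aligned_b: str,
                     partial_credit: float = 0.0, gap: str = "-") -> float:
    """Matched positions / total positions in the comparison window * 100.

    With partial_credit > 0, a conservative substitution is scored as a
    partial rather than a full mismatch, which raises the percentage.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    score = 0.0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # a gap column counts as a position but never as a match
        if a == b:
            score += 1.0
        elif is_conservative(a, b):
            score += partial_credit
    return 100.0 * score / len(aligned_a)


strict = percent_identity("LKD", "IKE")                         # identity only
adjusted = percent_identity("LKD", "IKE", partial_credit=0.5)   # similarity-adjusted
```

In this example only one of three positions is identical, but the other two are conservative substitutions under the assumed groupings, so the adjusted value is higher than the strict value.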
One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul, S., et al., NCBI NLM NIH Bethesda, Md. 20894, and Altschul, S., et al., J. Mol. Biol. 215: 403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold. These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value, the cumulative score goes to zero or below due to the accumulation of one or more negative-scoring residue alignments, or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
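The word-hit extension step described above can be sketched as follows. This is a simplified, ungapped, rightward-only illustration for nucleotide sequences and is not the NCBI implementation; the function name extend_right and the default x_drop value are assumptions introduced for this example, while the match reward (M=5) and mismatch penalty (N=-4) follow the BLASTN defaults stated above. Extension halts on any of the three stated conditions: the score falls by X from its maximum, the score reaches zero or below, or either sequence ends.

```python
def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, match: int = 5, mismatch: int = -4,
                 x_drop: int = 20):
    """Extend a seed word hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the number of
    aligned positions, counted from the seed start, at the maximum score.
    Assumes the seed lies fully inside both sequences.
    """
    def pair_score(offset: int) -> int:
        same = query[q_start + offset] == subject[s_start + offset]
        return match if same else mismatch

    # Score the seed word itself.
    score = sum(pair_score(i) for i in range(word_len))
    best_score, best_len = score, word_len

    offset = word_len
    # Condition 3: stop when the end of either sequence is reached.
    while q_start + offset < len(query) and s_start + offset < len(subject):
        score += pair_score(offset)
        if score > best_score:
            best_score, best_len = score, offset + 1
        # Condition 1: score fell by X from its maximum.
        # Condition 2: cumulative score went to zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        offset += 1
    return best_score, best_len


hit = extend_right("ACGTACGTTTTT", "ACGTACGTAAAA", 0, 0, word_len=4)
```

A leftward extension would mirror the same loop with decreasing offsets, and the two results would be combined to give the full high scoring sequence pair.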
In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90: 5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. BLAST searches assume that proteins can be modeled as random sequences. However, many real proteins comprise regions of nonrandom sequences which may be homopolymeric tracts, short-period repeats, or regions enriched in one or more amino acids. Such low-complexity regions may be aligned between unrelated proteins even though other regions of the protein are entirely dissimilar. A number of low-complexity filter programs can be employed to reduce such low-complexity alignments. For example, the SEG (Wootton and Federhen, Comput. Chem., 17: 149-163 (1993)) and XNU (Claverie and States, Comput. Chem., 17: 191-201 (1993)) low-complexity filters can be employed alone or in combination.
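The general idea behind low-complexity filtering can be illustrated with a sliding-window Shannon entropy mask. This sketch is not the SEG or XNU algorithm; it is a simpler stand-in, and the window size, entropy threshold, and mask character are arbitrary assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy, in bits, of the residue composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2,
                        mask_char: str = "X") -> str:
    """Mask every residue that lies in at least one low-entropy window.

    Sequences shorter than the window are returned unchanged.
    """
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = mask_char
    return "".join(masked)


# A homopolymeric glutamine tract flanked by more varied sequence.
example = "MKVLAGTERWDH" + "Q" * 12 + "HDWRETGALVKM"
masked_example = mask_low_complexity(example)
```

Masked residues would then be excluded from seeding or scoring, so that a homopolymeric tract alone does not produce a spurious alignment between otherwise unrelated proteins.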
The constructs and methods disclosed herein encompass nucleic acid and protein sequences having sequence identity/sequence similarity of at least about 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or 100% to those specifically disclosed, and/or sequences having the same or similar function. For example, if a protein or nucleic acid is identified with a transit peptide and the transit peptide is cleaved, leaving the protein sequence without the transit peptide, then the sequence identity/sequence similarity is compared to the protein with and/or without the transit peptide.
A "transgenic" organism, such as a transgenic plant, is a host organism that has been stably or transiently genetically engineered to contain one or more heterologous nucleic acid fragments, including nucleotide coding sequences, expression cassettes, vectors, etc. Introduction of heterologous nucleic acids into a host cell to create a transgenic cell is not limited to any particular mode of delivery, and includes, for example, microinjection, floral dip, adsorption, electroporation, vacuum infiltration, particle gun bombardment, whiskers-mediated transformation, liposome-mediated delivery, Agrobacterium-mediated transfer, the use of viral and retroviral vectors, etc., as is well known to those skilled in the art.
Conventional techniques of molecular biology, recombinant DNA technology, microbiology, and chemistry useful in practicing the methods of the present disclosure are described, for example, in Green and Sambrook (2012) Molecular Cloning: A Laboratory Manual, Fourth Edition, Cold Spring Harbor Laboratory Press; Ausubel et al. (2003 and periodic supplements) Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.; Amberg et al. (2005) Methods in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual, 2005 Edition, Cold Spring Harbor Laboratory Press; Roe et al. (1996) DNA Isolation and Sequencing: Essential Techniques, John Wiley & Sons; J. M. Polak and James O'D. McGee (1990) In Situ Hybridization: Principles and Practice, Oxford University Press; M. J. Gait (Editor) (1984) Oligonucleotide Synthesis: A Practical Approach, IRL Press; D. M. J. Lilley and J. E. Dahlberg (1992) Methods in Enzymology: DNA Structure Part A: Synthesis and Physical Analysis of DNA, Academic Press; Lab Ref: A Handbook of Recipes, Reagents, and Other Reference Tools for Use at the Bench, Edited by Jane Roskams and Linda Rodgers (2002) Cold Spring Harbor Laboratory Press; and Burgess and Deutscher (2009) Guide to Protein Purification, Second Edition (Methods in Enzymology, Vol. 463), Academic Press. Note also U.S. Pat. Nos. 8,178,339; 8,119,365; 8,043,842; 8,039,243; 7,303,906; 6,989,265; US20120219994A1; and EP1483367B1. The entire contents of each of these texts and patent documents are herein incorporated by reference.
Preliminary Results: Transgenic Plants Expressing Algal CCM Genes
Previously, reconstitution of a functional inorganic CCM in C3 plants to suppress photorespiration and enhance photosynthesis was proposed. In WO 2012/125737, it was hypothesized that expression of a minimum of three algal CCM proteins would be sufficient to elevate internal plastid CO2 concentrations high enough to suppress photorespiration. These three algal CCM genes included the Chlamydomonas plasma membrane-localized and ATP-dependent bicarbonate transporter, HLA3; the chloroplast envelope localized bicarbonate anion transporter, LCIA; and a chloroplast stromal-localized carbonic anhydrase (HCA-II) to accelerate conversion of bicarbonate into CO2. These genes have individually been shown to be important to the CCM in prior studies ([3-5]). To test this hypothesis, we generated multiple independent transgenic Arabidopsis and Camelina plants expressing each CCM gene as a single gene construct, as well as a stacked 3-gene construct. The expression of each gene was controlled by the light-regulated Cab1 gene promoter [6].
The results of phenotypic analyses of Arabidopsis and Camelina plants transformed with the single CCM gene constructs were as follows:
HLA3 Arabidopsis transgenics varied in their phenotypes, but generally had reduced growth phenotypes relative to wild-type (WT) plants (FIG. 6). When the same plasmid was used to transform Camelina, no viable seeds were recovered from any transformation event after multiple attempts, indicating that HLA3 expression was likely toxic to Camelina.
With respect to carbonic anhydrase (CA) transgenics, we expressed a human carbonic anhydrase-2 (HCA2; SEQ ID NO:17) or a bacterial Neisseria gonorrhoeae carbonic anhydrase (BCA; SEQ ID NO:4) in the chloroplast stroma [7]. We chose these CAs because each has a turnover number (k.sub.cat=10.sup.6 sec.sup.-1) that is approximately 10.times. faster than that of plant/algal CAs. In both Arabidopsis and Camelina, we observed phenotypes that were either similar to WT (HCA2) or substantially larger (BCA) than WT plants (FIG. 3B).
Transgenic Arabidopsis plants expressing the LCIA gene were substantially impaired in growth (FIG. 5A). In contrast, Camelina LCIA transgenics grew better than WT, had up to 25% higher photosynthetic rates at ambient CO.sub.2 concentrations, and had reduced CO.sub.2 compensation points (FIG. 5B).
The fact that expression of individual CCM genes impaired growth in C3 plants suggested that additional traits may need to be expressed or silenced to achieve optimal photosynthetic performance.
To determine if we could reconstitute a fully functional CCM complex in C3 plants, we transformed Arabidopsis and Camelina with a triple-gene CCM construct in which the expression of the HLA3, CA, and LCIA genes was driven by the green-tissue specific Cab1 promoter. In both Arabidopsis and Camelina there was either a substantial impairment in growth, or the plants did not survive (results not shown).
Thus, co-expression of the HLA3 gene with any other CCM gene(s) impaired growth even in plants in which expression of the other CCM genes, e.g., LCIA in Camelina, or BCA in Arabidopsis, enhanced growth. These results indicated that HLA3 expression was problematic.
Since the HLA3 protein catalyzes active bicarbonate transport and is the first dedicated step in the engineered CCM, we re-focused our efforts on trying to determine why HLA3 expression was toxic to plants and how to mitigate its effects. We considered two possible hypotheses for HLA3 toxicity: 1) expression of the HLA3 ABC-transporter increases ATP demand (1 ATP/CO.sub.2) for photosynthesis by 25% and depletes cytoplasmic ATP levels [3-5,8]; and 2) elevated bicarbonate levels in HLA3 transgenic plants negatively impact cytoplasmic pH levels. With respect to the latter hypothesis, it is noteworthy that, unlike cyanobacteria, plants have robust cytoplasmic CA activity, potentially mitigating the effects of elevated bicarbonate levels on cytoplasmic pH.
The Role of ATP Demand and Cyclic Electron Transfer Activity in CCMs
In contrast to air-grown algae (4 ATP/2 NADPH/CO.sub.2) and C4 plants (5 ATP/2 NADPH/CO.sub.2), which have increased ATP demands for photosynthesis, C3 plants (3 ATP/2 NADPH/CO.sub.2) have limited capacity to generate additional ATP for each electron transferred [8-10]. Increasing ATP demand by 25% per carbon fixed in HLA3 transgenic plants, therefore, could deplete cytoplasmic ATP levels as well as alter the redox state of the cell [8,10]. One mechanism to increase ATP synthesis for each light-driven electron transferred is cyclic electron transfer (CET) activity. Light-driven CET is catalyzed by photosystem I (PSI) mediated charge separation leading to the reduction of ferredoxin (Fd) and the PGR5 protein. The PGR5 protein reduces and protonates plastoquinone (PQ). PQH2 is then oxidized by the cytochrome b6f complex (Cyt b6f). Protons released from the oxidation of PQH2 drive ATP synthesis. The electron transfer cycle is completed by the reduction of plastocyanin (PC) by Cyt b6f, which in turn is oxidized by the PSI primary donor P700+. Significantly, molecular studies have demonstrated that genes encoding proteins functional in CET are substantially overexpressed (4-10.times.) in C4 plants and air-grown algae relative to related C3 species or high CO.sub.2 grown algae [9,11-17]. These CET genes include the Proton Gradient Regulation genes PGR5 and PGRL1, and certain members of the Fd and ferredoxin NADP reductase (FNR) gene families [8-15] (Accession Nos.: PGR5: NM_126585; PGRL1: NM_179091; Fd: AtFd1: At1g10960; AtFd2: At1g60950; FNR: LFNR1: At5g66190; LFNR2: At1g20020) [15]. The sequence for the PGR5 protein with the transit peptide amino acid sequence underlined is provided as
TABLE-US-00002 (SEQ ID NO: 1) MAAASISAIG CNQTLIGTSF YGGWGSSISG EDYQTMLSKT VAPPQQARVS RKAIRAVPMM KNVNEGKGLF APLVVVTRNL VGKKRFNQLR GKAIALHSQV ITEFCKSIGA DAKQRQGLIR AKKNGERLG FL.
The transit peptide is cleaved to produce the functional PGR5 protein.
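The ATP and NADPH requirements per CO.sub.2 fixed that are quoted in this section can be compared with a short calculation. The following Python sketch is illustrative only: the dictionary and function names are assumptions introduced here, and the single additional ATP per CO.sub.2 attributed to HLA3 reflects the hypothesis stated above rather than a measured value.

```python
from fractions import Fraction

# ATP and NADPH consumed per CO2 fixed, as quoted in the text.
DEMAND = {
    "C3 plant": (3, 2),
    "air-grown alga": (4, 2),
    "C4 plant": (5, 2),
}


def atp_per_nadph(atp, nadph):
    """Return the ATP/NADPH ratio as an exact fraction."""
    return Fraction(atp, nadph)


def with_transporter(atp, extra_atp=1):
    """ATP per CO2 after adding an ATP-dependent transporter.

    extra_atp defaults to 1 ATP per CO2, an assumption taken from the
    HLA3 hypothesis in the text.
    """
    return atp + extra_atp


if __name__ == "__main__":
    for name, (atp, nadph) in DEMAND.items():
        ratio = float(atp_per_nadph(atp, nadph))
        print(f"{name}: {ratio:.2f} ATP per NADPH")
    c3_atp, _ = DEMAND["C3 plant"]
    print("C3 + HLA3:", with_transporter(c3_atp), "ATP per CO2")
```

Under these assumptions, a C3 plant expressing HLA3 would require 4 ATP per CO.sub.2, the same figure quoted above for air-grown algae.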
To test the hypothesis that ATP depletion in HLA3 transgenics resulted in growth impairment, we compared the phenotypes of WT and HLA3 transgenics grown on nitrate, which would require more linear electron transport (LET) to facilitate the reduction of nitrate. Significantly, the additional ATP produced by LET is not required for conversion of nitrate to ammonium and thus total ATP levels are expected to increase. In contrast, plants grown on ammonium do not require additional LET. Finally, we also grew transgenics on ammonium with sucrose, which would presumably provide additional ATP via respiration [15,17]. We hypothesized that growth on nitrate or ammonium with sucrose would provide additional ATP that could potentially drive HLA3 activity.
As shown in FIG. 2B, none of the Arabidopsis HLA3 transgenics (4 independent lines) grew in the presence of ammonium, but all HLA3 lines were rescued when grown on ammonium with sucrose. Furthermore, plants grown on ammonium plus sucrose were phenotypically similar to WT (FIG. 2B). In contrast, all HLA3 plants grown on nitrate survived, but some lines (#9, #20) had substantially impaired growth phenotypes. Identical results were observed for the germination and growth of WT and HLA3 transgenic seeds on MS media agar plates using either nitrate (HLA3 transgenics survived) or ammonium (HLA3 transgenics died) as the sole nitrogen source (results not shown). Based on these observations, we propose that increased ATP synthesis associated with nitrate-driven LET and/or sucrose metabolism reduces the depletion of cytoplasmic ATP levels in HLA3 transgenics and rescues them.
This interpretation was corroborated by comparative metabolite analyses of leaf energy charge (EC) status (ATP), inorganic phosphate levels, and leaf reductive potential (RP) of WT and HLA3 transgenic Arabidopsis grown on nitrate. As shown in FIG. 6, HLA3 transgenics grown on nitrate had reduced EC and RP ratios relative to WT. Energy charge is defined as ([ATP]+1/2[ADP])/([ATP]+[ADP]+[AMP]). The reduction potential is a measurement of the capacity of the system to gain or lose electrons.
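The energy charge definition given above can be written as a small function. This is a minimal sketch: the function name and the example concentrations are hypothetical and are not data from the analyses described here.

```python
def adenylate_energy_charge(atp, adp, amp):
    """Energy charge = ([ATP] + 0.5*[ADP]) / ([ATP] + [ADP] + [AMP]).

    All three concentrations must be non-negative and in the same units.
    """
    if min(atp, adp, amp) < 0:
        raise ValueError("concentrations must be non-negative")
    total = atp + adp + amp
    if total == 0:
        raise ValueError("total adenylate concentration must be positive")
    return (atp + 0.5 * adp) / total


if __name__ == "__main__":
    # Hypothetical values in arbitrary but identical units.
    print(adenylate_energy_charge(atp=8.0, adp=1.5, amp=0.5))
```

The ratio ranges from 0 (all adenylate present as AMP) to 1 (all adenylate present as ATP), so a lower value in a transgenic line relative to WT indicates a smaller share of the adenylate pool in the high-energy forms.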
Significantly, inorganic phosphate levels were two-fold higher in HLA3 line #20, while the NADH level was two-fold lower than WT.
These results are consistent with the hypothesis that HLA3 expression places increased ATP demand on plants. This increased ATP demand in HLA3 transgenics may be met in part via NAD(P)H oxidation via the malate/oxaloacetate redox shunt between the mitochondria and chloroplasts [16].
LCIA Phenotype Depends on Plant Species
As previously indicated, LCIA expression in transgenic Arabidopsis resulted in plants with severely depressed growth phenotypes (FIG. 5A). In contrast, transgenic Camelina expressing LCIA had increased growth rates as well as higher CO.sub.2-dependent photosynthetic rates relative to WT (FIG. 5B). We propose that the substantially greater carbon sink-strength of Camelina relative to Arabidopsis accounts for the enhanced growth phenotype observed in Camelina LCIA plants. In support of this hypothesis, we observed that Camelina LCIA transgenics had higher CO.sub.2-dependent rates of photosynthesis and lower CO.sub.2 compensation points (40 vs. 53 ppm CO.sub.2) than WT plants, indicative of facilitated inorganic carbon uptake by LCIA (FIG. 5C).
Overview: Enhancing photosynthetic carbon fixation by increasing ATP production and limiting CO.sub.2 diffusion out of artificial CCM lines; Strategies for facilitating CET and ATP synthesis in C3 plants
Prior attempts to subvert the limitations of photosynthesis have focused on engineering RuBisCO throughput and specificity [35] by introduction of engineered and non-native forms of the enzyme [36], through alterations in the regenerative capacity of the Calvin cycle [37,38] or by engineering photorespiratory bypasses [39]. These studies produced mixed results, thus advocating for a more comprehensive systems-level approach to enhance and/or redirect photosynthetic carbon flux.
As evidenced by our prior work described above, we postulate that both the carbon assimilatory steps and the light-based generation of ATP and NADPH must be considered to develop a competent CCM with significantly improved photosynthetic capacity. To demonstrate proof of concept, an Arabidopsis line that contains a functional CCM that includes mechanisms to adjust ATP levels to meet transporter demand will be generated.
Enhancing CET and ATP Synthesis to Support HLA3-Dependent Bicarbonate Uptake
Exploiting the expression of an algal CCM in C3 plants requires that we meet the additional energy demands of actively transporting inorganic carbon. As previously discussed in the section entitled "The role of ATP demand and cyclic electron transfer activity in CCMs", C4 plants and algae have robust CET activity, and overexpress a variety of genes involved in CET [13,16,40-45] compared to C3 plants.
Several strategies to increase ATP synthesis in support of HLA3-dependent bicarbonate transport are identified in the following examples. Several of these strategies focus on elevating CET activity in C3 plants. Another approach involves the expression of a green photon-driven bacterial proton pump in thylakoids to supplement proton-driven ATP synthesis. Each approach is designed to complement existing CCM lines in Arabidopsis, Camelina, and potato that we have created, and is evaluated based upon measured adenylate levels, plant biomass production, and photosynthetic measurements of carbon assimilation. The materials and methods employed in the examples below are for illustrative purposes only, and are not intended to limit the practice of the present embodiments thereto. Any materials and methods similar or equivalent to those described herein as would be apparent to one of ordinary skill in the art can be used in the testing or practice of the present embodiments, i.e., the materials, methods, and examples are illustrative only and not intended to be limiting.
Example 1: Enhancing CET Based on Overexpressing the Proton Gradient Regulatory Proteins PGR5 and PGRL1 in C3 Plants
Enhancing CET is based on overexpressing the proton gradient regulatory proteins PGR5 and/or PGRL1, which have previously been shown to be important to CET [37].
It has recently been demonstrated that the PGRL1 protein has antimycin A (AA)-sensitive ferredoxin-plastoquinone reductase (FQR) activity [46]. In Chlamydomonas, PGRL1 is part of the Cytb6f/PSI supercomplex which mediates CET. Significantly, PGRL1 forms homodimers as well as heterodimers with PGR5 via redox-active cysteine residues. Under high-light conditions, reduced thioredoxin reduces PGRL1 dimers present in grana stacks, increasing the abundance of PGRL1 monomers and enhancing CET [47]. Mutational studies have shown that the PGR5 protein is required for Fd oxidation and PGRL1 reduction, but not for PQ reduction. In addition, it has been shown that PGRL1/PGR5 heterodimers are more active in CET than PGRL1 monomers. In C4 plants PGR5 and PGRL1 expression levels are elevated (4.times.) relative to C3 plants [9]. Similarly, PGR5 expression is up-regulated in air-grown Chlamydomonas (active CCM and HLA3 activity) relative to high CO.sub.2 (low CCM) grown cells [16,43]. Significantly, overexpression of PGRL1 and PGR5 has also been shown to increase AA-sensitive CET in transgenic Arabidopsis [48]. One embodiment of the present invention provides for overexpression of the PGRL1 gene (SEQ ID NO:106) and the PGR5 gene with chloroplast targeting sequence (SEQ ID NO:2) together with the HLA3 gene (SEQ ID NO:12), or together with the HLA3 gene (SEQ ID NO:12), the LCIA gene (SEQ ID NO:16), and the BCA gene codon optimized for expression in Arabidopsis (SEQ ID NO:4), to yield substantially increased photosynthetic rates, particularly in plants with enhanced sink strength (Camelina and potato, for example). Co-expression of the PGR5 gene (SEQ ID NO:2) along with the HLA3 gene (SEQ ID NO:12) in Camelina rescued the HLA3 transgenics, and HLA3 expression was no longer lethal. These results indicate that PGR5 expression enables the production of sufficient ATP to meet the demands of the HLA3 gene product.
HLA3 (SEQ ID NO:12) and PGR5 (SEQ ID NO:2) are introduced as a double construct into Arabidopsis or Camelina by Agrobacterium-mediated Ti plasmid transformation using, for example, plasmid pB110-HLA3-pgr5-dsred (FIG. 9). Since PGR5 protein (SEQ ID NO:1) is naturally targeted to the thylakoid membranes, no additional targeting sequences are introduced. Similarly, since HLA3 protein (SEQ ID NO:77) is naturally targeted to the chloroplast envelope, no additional targeting sequences are added. HLA3 is codon optimized for plant expression.
In one embodiment, the expression of each protein is driven by the light sensitive leaf-specific CAB1 promoter (SEQ ID NO:7) and Nos terminator (SEQ ID NO:9) (FIG. 9).
The BCA gene (AAW89307; SEQ ID NO:4), under the control of the CAB1 promoter, is introduced into Arabidopsis by Agrobacterium-mediated Ti plasmid transformation by the floral dip method using the construct shown in FIG. 10.
As a visual marker, the plasmid also includes a gene for expression of fluorescent DsRed protein under the control of the CVMV promoter and Nos terminator (FIG. 10).
Plants are transformed by the vacuum infiltration method (Lu and Kang (February, 2008) Plant Cell Rep. 27(2):273-8), and will be screened for biomass yield parameters (including plant weight, height, branching and seed yield) and photosynthetic efficiency measured as CO.sub.2 absorption with the aid of a LiCor 6400 gas exchange analyzer.
The PGRL1 gene from Arabidopsis (NM_179091; SEQ ID NO:3) will be subcloned into a pCambia1301-based binary plasmid under control of the CAB1 promoter (SEQ ID NO:7) and Nos terminator (SEQ ID NO:9). The plasmid will also carry a gene for a hygromycin selection marker. Agrobacterium-mediated transformation takes place by the standard floral dip method followed by germination of seeds on hygromycin to select for transformants. The expression of PGRL1 will be confirmed by RT-PCR, and the resulting transgenic plant lines will be crossed with HLA3/PGR5 plants and screened for biomass yield and photosynthesis rate (CO.sub.2 fixation).
Example 2: Determining if Fd1 Gene Overexpression can Support Algal CCM and Increased Photosynthetic Rates
It has recently been demonstrated that specific members of the ferredoxin (Fd) gene family facilitate CET. Overexpression of pea ferredoxin1 (Fd1) enhanced CET at the expense of LET in tobacco [16,40].
Therefore, another embodiment of the present invention provides for enhancing ATP production and titrating the expression of the pea Fd1 gene in the three model C3 plants with and without co-expression of the CCM genes to determine if Fd1 overexpression can support the algal CCM and increased photosynthetic rates. Earlier results demonstrated that Fd1 overexpression slightly impaired Linear Electron Transfer (LET), resulting in a stunted phenotype [40]. We expect that the additional ATP demand in HLA3 transgenics, however, will mitigate these effects.
The Fd1 gene (At1g10960) will be introduced by Agrobacterium-mediated Ti plasmid transformation. The Fd1 gene will be subcloned into a pCambia1301-based binary plasmid under control of the CAB1 promoter (SEQ ID NO:7) and Nos terminator (SEQ ID NO:9). The plasmid will also carry a gene for hygromycin selection as a marker. Agrobacterium-mediated transformation takes place by the standard floral dip method, followed by germination of seeds on hygromycin to select for transformants. The expression of FD1 (SEQ ID NO:93) will be confirmed by real time QPCR, and the resulting plant lines exhibiting different levels of FD1 expression will be crossed with CCM-expressing plants and screened for biomass yield and photosynthesis rate with the aid of a LiCor 6400 CO.sub.2-gas exchange analyzer.
Example 3: Overexpression of Unique Ferredoxin NADP Reductase (FNR) Gene Family Members Associated with CET
Yet another embodiment is based on overexpression of unique ferredoxin NADP reductase (FNR) gene family members associated with CET. Leaf FNR (LFNR) catalyzes the reduction of Fd and is involved in both LET and CET [15]. It was recently demonstrated that there are three LFNR gene family members expressed in maize leaves: Accession Nos. BAA88236 (LFNR1), BAA88237 (LFNR2), and ACF85815 (LFNR3).
LFNR1 was shown to be localized to thylakoid membranes and associated with Cytb6f complexes. LFNR2 was present in thylakoids and stroma associated with Cytb6f complexes. LFNR3 was soluble and not associated with Cytb6f complexes.
Significantly, when plants were grown with nitrate instead of ammonium, expression of LFNR1 and LFNR2 was elevated but not that of LFNR3. In contrast, studies using Arabidopsis LFNR1 knockout mutants demonstrated that PGA-dependent oxygen evolution (which requires additional ATP) is more negatively affected than is nitrate-dependent oxygen evolution (no additional ATP demand), suggesting that LFNR1 may play a role in regulating CET [15]. However, this interpretation remains equivocal.
To determine if CET activity and HLA3-mediated inorganic carbon uptake can be altered by differential expression of LFNR1, we will both over-express (CAB1 promoter (SEQ ID NO:7)) and under-express (LFNR1 RNAi) LFNR1 in transgenic Arabidopsis to determine the impact of altered LFNR1 expression on functional CCM activity.
For overexpression of LFNR1, the gene (At5g66190) will be introduced by Agrobacterium-mediated Ti plasmid transformation by floral dipping. The LFNR1 gene will be subcloned into a pCambia1301-based binary plasmid under control of the CAB1 promoter (SEQ ID NO:7) and Nos terminator (SEQ ID NO:9). The plasmid will also carry a gene for hygromycin selection as a marker. The expression of LFNR1 will be confirmed by real time QPCR, and the resulting plant lines will be crossed with CCM-expressing plants and screened for biomass yield and photosynthesis rate with the aid of a LiCor 6400 CO.sub.2-gas exchange analyzer.
For downregulation of LFNR1 levels, an RNAi construct containing a partial sequence of LFNR1 (At5g66190 or BAA88236) and the reverse complementary sequence of LFNR1 will be subcloned into a pCambia1301-based binary plasmid under control of the CAB1 promoter (SEQ ID NO:7) and Nos terminator (SEQ ID NO:9). The plasmid will also carry a gene for hygromycin selection as a marker. The reduced level of LFNR1 expression will be confirmed by real time QPCR.
The resulting lines will be crossed with CCM-expressing lines to generate double mutants. Those mutants will be screened for biomass yield parameters (including plant weight, height, branching and seed yield) and photosynthetic efficiency measured as CO.sub.2 absorption with the aid of a LiCor 6400 gas exchange analyzer.
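The RNAi construct described above pairs a partial gene sequence with its reverse complement. As an illustration of that pairing only, the following Python sketch assembles an inverted-repeat insert in silico. The function names, the 12-nucleotide fragment, and the spacer are hypothetical placeholders: the fragment is not LFNR1 sequence, and the text does not specify a spacer, an arm length, or an arm orientation.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    if set(seq.upper()) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def hairpin_insert(sense_arm, spacer):
    """Join sense arm, spacer, and antisense arm into one inverted repeat."""
    return sense_arm + spacer + reverse_complement(sense_arm)


if __name__ == "__main__":
    fragment = "ATGGCTTCAGGT"  # placeholder, NOT the LFNR1 sequence
    spacer = "GTAAGTTTCAG"     # placeholder spacer, hypothetical
    print(hairpin_insert(fragment, spacer))
```

A transcript of such an insert can fold back on itself so that the two arms base-pair, which is the general basis of hairpin RNAi designs.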
Example 4: Facilitated Vectorial Proton Transport Using Proteorhodopsin (PR)
In yet another embodiment, green photons, which are not absorbed by chlorophyll, will be employed to drive proton transport across thylakoids by expressing a modified PR [49], thereby enhancing ATP synthesis (FIG. 7).
PR is a seven-helix transmembrane-spanning protein similar to bacteriorhodopsin that contains retinal in its active site. Green light-driven cis-trans isomerization of retinal drives vectorial proton transfer across the membrane [50-55]. Significantly, it has been demonstrated that a functional PR could be expressed in a respiration-impaired mutant of E. coli when supplemented with exogenous all-trans retinal [56]. More recently, hydrogen production was shown to increase nearly two-fold in PR-expressing E. coli when cells were exposed to increasing light intensities (70 to 130 .mu.E), indicating that PR can efficiently absorb light even at low intensities [57]. To the best of our knowledge, retinal complementation of other rhodopsins has not been reported. Significantly, PR-expressing E. coli respiratory mutants generated sufficient proton-motive force to support ATP synthesis, leading to enhanced cell viability and motility when transgenics were exposed to sunlight as the only energy source.
These results suggest that targeting PR to the thylakoid membrane using appropriate targeting sequences (e.g., nuclear-encoded, N-terminal, light harvesting complex signal sequences) and supplementation with exogenous retinal (or retinal derived from .beta.-carotene cleavage) could drive additional ATP synthesis. One concern is that the optical cross section of retinal is small and light harvesting by PR is not supplemented by antenna complexes. This constraint may be overcome in part by overexpressing PR in thylakoids. Regardless, the additional proton gradient necessary to support HLA3 activity is substantially less than that required to support overall CO.sub.2 fixation. The best achievable PR expression levels will be determined empirically using different gene promoters, e.g., psaD (SEQ ID NO:10), rbcs (SEQ ID NO:11), and cab1 (SEQ ID NO:7), to drive its expression.
Generation of Improved PR and its Functional Reconstitution in Chloroplasts
PR (AF279106), for example (SEQ ID NO:98), will be introduced into Arabidopsis, Camelina, and potato by Ti plasmid transformation and targeted to the thylakoid membrane using the DNAJ transit peptide (At5g21430, SEQ ID NO:22) or psbX stop-transfer trans-membrane domain (At2g06520, SEQ ID NO:23) fused to the C-terminus of PR [58], or transit peptides from nuclear encoded chloroplast proteins such as CAB (SEQ ID NO:13), PGR5 (SEQ ID NO:14), and psaD (SEQ ID NO:15). Reconstitution with exogenous retinal will be carried out in a manner similar to strategies described for E. coli, except that retinal will be painted on the surface of the leaf [56] to demonstrate proof of concept. Retinal reconstitution will be followed by monitoring the absorption of the thylakoid membranes at 540 nm [59].
If exogenously applied retinal is not incorporated into PR, we will express low levels of a plant codon-optimized .beta.-carotene monooxygenase, for example SEQ ID NO:100, in plastids to cleave a small fraction of .beta.-carotene to generate retinal. Non-limiting examples of .beta.-carotene monooxygenases that can be used include, for example, mouse, human, zebra fish, and rat enzymes (Accession Nos. AW044715, AK001592, AJ290390, and NM_053648, respectively). Alternatively, if .beta.-carotene levels are severely depleted, we will transiently express .beta.-carotene monooxygenase under the control of a transient inducible promoter, such as an ethanol inducible gene promoter, for periods of time sufficient to fully saturate PR [60,61]. This promoter is available as an EcoRI/PstI fragment from Syngenta-Construct: pJL67-5S::AlcR/AlcA::GUS in pMLBART (Weigel World, Max Planck Institute for Developmental Biology, Tubingen, Germany). Operation of a functional retinal photocycle in PR will be confirmed by transient absorption spectroscopy [62].
Alternatively, green tissue/leaf-specific promoters such as the CAB (At3g54890; SEQ ID NO:7) and rbcS (At5g38420; SEQ ID NO:11) promoters can be used; for example, see SEQ ID NO:5 for the BCA protein with a rbc-1a transit peptide. As the skilled person will be well aware, various promoters may be used to promote the transcription of the nucleic acid of the invention, i.e. the nucleic acid which when transcribed yields an RNA molecule that modulates the expression and/or activity of a protein according to the invention. Such promoters include, for example, constitutive promoters, inducible promoters (e.g. light inducible promoters, stress-inducible promoters, drought-inducible promoters, hormone-inducible promoters, chemical-inducible promoters, etc.), tissue-specific promoters, developmentally regulated promoters and the like.
Thus, a plant expressible promoter can be a constitutive promoter, i.e. a promoter capable of directing high levels of expression in most cell types (in a spatio-temporal independent manner). Examples of plant expressible constitutive promoters include promoters of bacterial origin, such as the octopine synthase (OCS) and nopaline synthase (NOS) promoters from Agrobacterium, but also promoters of viral origin, such as that of the cauliflower mosaic virus (CaMV) 35S transcript (Harpster et al., 1988, Mol. Gen. Genet. 212:182-190) or 19S RNA genes (Odell et al., 1985, Nature. 6; 313(6005):810-2; U.S. Pat. No. 5,352,605; WO 84/02913; Benfey et al., 1989, EMBO J. 8:2195-2202), the enhanced 2.times.35S promoter (Kay et al., 1987, Science 236:1299-1302; Datla et al. (1993), Plant Sci 94:139-149), promoters of the cassava vein mosaic virus (CsVMV; WO 97/48819, U.S. Pat. No. 7,053,205), 2.times.CsVMV (WO2004/053135), the circovirus (AU 689 311) promoter, the sugarcane bacilliform badnavirus (ScBV) promoter (Samac et al., 2004, Transgenic Res. 13(4):349-61), the figwort mosaic virus (FMV) promoter (Sanger et al., 1990, Plant Mol Biol. 14(3):433-43), the subterranean clover virus promoter No 4 or No 7 (WO 96/06932) and the enhanced 35S promoter as described in U.S. Pat. Nos. 5,164,316, 5,196,525, 5,322,938, 5,359,142 and 5,424,200. Among the promoters of plant origin, mention will be made of the promoter of the Arabidopsis thaliana histone H4 gene (Chaboute et al., 1987), the ubiquitin promoters (Holtorf et al., 1995, Plant Mol. Biol. 29:637-649, U.S. Pat. No. 5,510,474) of maize, rice and sugarcane, the rice actin 1 promoter (Act-1, U.S. Pat. No. 5,641,876), the histone promoters as described in EP 0 507 698 A1, and the maize alcohol dehydrogenase 1 promoter (Adh-1) (from the world wide web at patentlens.net/daisy/promoters/242.html).
A variety of plant gene promoters that regulate gene expression in response to environmental, hormonal, chemical, developmental signals, and in a tissue-active manner can be used for expression of a sequence in plants. Choice of a promoter is based largely on the phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen, vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to heat, cold, drought, light etc.), timing, developmental stage, and the like.
Promoters that can be used to practice this invention include those that are green tissue specific, such as the promoter of light harvesting complex protein 2 (Sakamoto et al. Plant Cell Physiology, 1991, 32(3): 385-393) or the promoter of the cytosolic fructose-1,6-bisphosphatase from rice (Si et al. Acta Botanica Sinica 45: 3(2003): 359-364). Alternative embodiments include light inducible promoters such as the plant ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) small subunit promoter (U.S. Pat. No. 4,962,028; WO99/25842) from Zea mays and sunflower. The small subunit promoter from Chrysanthemum may also be used, combined or not combined with the use of the respective terminator (Outchkourov et al., Planta, 216: 1003-1012, 2003).
Additional promoters that can be used to practice this invention are those that elicit expression in response to stresses, such as the RD29 promoters that are activated in response to drought, low temperature, salt stress, or exposure to ABA (Yamaguchi-Shinozaki et al., 2004, Plant Cell, Vol. 6, 251-264; WO12/101118), but also promoters that are induced in response to heat (e.g., see Ainley et al. (1993) Plant Mol. Biol. 22: 13-23); light (e.g., the pea rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1: 471-478, and the maize rbcS promoter, Schaffner and Sheen (1991) Plant Cell 3: 997-1012); wounding (e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961-968); pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol. 40: 387-396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38: 1071-1080); and chemicals such as methyl jasmonate or salicylic acid (e.g., see Gatz (1997) Annu. Rev. Plant Physiol. Plant Mol. Biol. 48: 89-108). In addition, the timing of the expression can be controlled by using promoters such as those acting at senescence (e.g., see Gan and Amasino (1995) Science 270: 1986-1988) or late seed development (e.g., see Odell et al. (1994) Plant Physiol. 106: 447-458).
Use may also be made of salt-inducible promoters such as the salt-inducible NHX1 promoter of rice landrace Pokkali (PKN) (Jahan et al., 6.sup.th International Rice Genetics symposium, 2009, poster abstract P4-37), the salt inducible promoter of the vacuolar H+-pyrophosphatase from Thellungiella halophila (TsVP1) (Sun et al., BMC Plant Biology 2010, 10:90), and the salt-inducible promoter of the Citrus sinensis gene encoding phospholipid hydroperoxide isoform gpx1 (Avsian-Kretchmer et al., Plant Physiology July 2004 vol. 135, p 1685-1696).
In alternative embodiments, tissue-specific and/or developmental stage-specific promoters are used, e.g., a promoter that can promote transcription only within a certain time frame of developmental stage within that tissue. See, e.g., Blazquez (1998) Plant Cell 10:791-800, characterizing the Arabidopsis LEAFY gene promoter. See also Cardon (1997) Plant J 12:367-77, describing the transcription factor SPL3, which recognizes a conserved sequence motif in the promoter region of the A. thaliana floral meristem identity gene AP1; and Mandel (1995) Plant Molecular Biology, Vol. 29, pp 995-1004, describing the meristem promoter eIF4. Tissue specific promoters which are active throughout the life cycle of a particular tissue can be used. Other promoters that can be used to express the nucleic acids of the invention include a leaf-specific promoter (see, e.g., Busk (1997) Plant J. 11:1285-1295, describing a leaf-specific promoter in maize); a tomato promoter active during fruit ripening, senescence and abscission of leaves and, to a lesser extent, of flowers (see, e.g., Blume (1997) Plant J. 12:731-746); a guard-cell preferential promoter, e.g., as described in PCT/EP12/065608; the Blec4 gene from pea, which is active in epidermal tissue of vegetative and floral shoot apices of transgenic alfalfa, making it a useful tool to target the expression of foreign genes to the epidermal layer of actively growing shoots or fibers; the ovule-specific BEL1 gene (see, e.g., Reiser (1995) Cell 83:735-742, GenBank No. U39944); and/or the promoter in Klee, U.S. Pat. No. 5,589,583, describing a plant promoter region that is capable of conferring high levels of transcription in meristematic tissue and/or rapidly dividing cells. Further tissue specific promoters that may be used according to the invention include promoters active in vascular tissue (e.g., see Ringli and Keller (1998) Plant Mol. Biol. 37: 977-988), carpels (e.g., see Ohl et al. (1990) Plant Cell 2.
In alternative embodiments, plant promoters which are inducible upon exposure to plant hormones, such as auxins, are used to express the nucleic acids used to practice the invention. For example, the invention can use the auxin-response elements E1 promoter fragment (AuxREs) in soybean (Glycine max L.) (Liu (1997) Plant Physiol. 115:397-407); the auxin-responsive Arabidopsis GST6 promoter (also responsive to salicylic acid and hydrogen peroxide) (Chen (1996) Plant J. 10: 955-966); the auxin-inducible parC promoter from tobacco (Sakai (1996) 37:906-913); a plant biotin response element (Streit (1997) Mol. Plant Microbe Interact. 10:933-937); and the promoter responsive to the stress hormone abscisic acid (ABA) (Sheen (1996) Science 274:1900-1902). Further hormone inducible promoters that may be used include auxin-inducible promoters (such as that described in van der Kop et al. (1999) Plant Mol. Biol. 39: 979-990 or Baumann et al., (1999) Plant Cell 11: 323-334), cytokinin-inducible promoters (e.g., see Guevara-Garcia (1998) Plant Mol. Biol. 38: 743-753), promoters responsive to gibberellin (e.g., see Shi et al. (1998) Plant Mol. Biol. 38: 1053-1060, Willmott et al. (1998) Plant Molec. Biol. 38: 817-825) and the like.
In alternative embodiments, nucleic acids used to practice the invention can also be operably linked to plant promoters which are inducible upon exposure to chemical reagents which can be applied to the plant, such as herbicides or antibiotics. For example, the maize In2-2 promoter, activated by benzenesulfonamide herbicide safeners, can be used (De Veylder (1997) Plant Cell Physiol. 38:568-577); application of different herbicide safeners induces distinct gene expression patterns, including expression in the root, hydathodes, and the shoot apical meristem. The coding sequence can be under the control of, e.g., a tetracycline-inducible promoter, e.g., as described with transgenic tobacco plants containing the Avena sativa L. (oat) arginine decarboxylase gene (Masgrau (1997) Plant J. 11:465-473); or a salicylic acid-responsive element (Stange (1997) Plant J. 11:1315-1324). Using chemically (e.g., hormone- or pesticide-) induced promoters, i.e., promoters responsive to a chemical which can be applied to the transgenic plant in the field, expression of a polypeptide of the invention can be induced at a particular stage of development of the plant. Use may also be made of the estrogen-inducible expression system as described in U.S. Pat. No. 6,784,340 and Zuo et al. (2000, Plant J. 24: 265-273) to drive the expression of the nucleic acids used to practice the invention.
In alternative embodiments, a promoter may be used whose host range is limited to target plant species, such as corn, rice, barley, wheat, potato or other crops, inducible at any stage of development of the crop.
In alternative embodiments, a tissue-specific plant promoter may drive expression of operably linked sequences in tissues other than the target tissue. In alternative embodiments, a tissue-specific promoter that drives expression preferentially in the target tissue or cell type, but may also lead to some expression in other tissues as well, is used.
According to the invention, use may also be made, in combination with the promoter, of other regulatory sequences, which are located between the promoter and the coding sequence, such as transcription activators ("enhancers"), for instance the translation activator of the tobacco mosaic virus (TMV) described in Application WO87/07644, or of the tobacco etch virus (TEV) described by Carrington & Freed 1990, J. Virol. 64: 1590-1597, for example.
Other regulatory sequences that enhance the expression of the nucleic acid of the invention may also be located within the chimeric gene. One example of such regulatory sequences is introns. Introns are intervening sequences present in the pre-mRNA but absent in the mature RNA following excision by a precise splicing mechanism. The ability of natural introns to enhance gene expression, a process referred to as intron-mediated enhancement (IME), has been known in various organisms, including mammals, insects, nematodes and plants (WO 07/098042, p 11-12). IME is generally described as a posttranscriptional mechanism leading to increased gene expression by stabilization of the transcript. The intron is required to be positioned between the promoter and the coding sequence in the normal orientation. However, some introns have also been described to affect translation, to function as promoters or as position- and orientation-independent transcriptional enhancers (Chaubet-Gigot et al., 2001, Plant Mol Biol. 45(1):17-30, p 27-28).
Examples of genes containing such introns include the 5' introns from the rice actin 1 gene (see U.S. Pat. No. 5,641,876), the rice actin 2 gene, the maize sucrose synthase gene (Clancy and Hannah, 2002, Plant Physiol. 130(2):918-29), the maize alcohol dehydrogenase-1 (Adh-1) and Bronze-1 genes (Callis et al. 1987 Genes Dev. 1(10):1183-200; Mascarenhas et al. 1990, Plant Mol Biol. 15(6):913-20), the maize heat shock protein 70 gene (see U.S. Pat. No. 5,593,874), the maize shrunken 1 gene, the light sensitive 1 gene of Solanum tuberosum, and the heat shock protein 70 gene of Petunia hybrida (see U.S. Pat. No. 5,659,122), the replacement histone H3 gene from alfalfa (Keleman et al. 2002 Transgenic Res. 11(1):69-72) and either replacement histone H3 (histone H3.3-like) gene of Arabidopsis thaliana (Chaubet-Gigot et al., 2001, Plant Mol Biol. 45(1):17-30).
Other suitable regulatory sequences include 5' UTRs. As used herein, a 5' UTR, also referred to as a leader sequence, is a particular region of a messenger RNA (mRNA) located between the transcription start site and the start codon of the coding region. It is involved in mRNA stability and translation efficiency. For example, the 5' untranslated leader of a petunia chlorophyll a/b binding protein gene downstream of the 35S transcription start site can be utilized to augment steady-state levels of reporter gene expression (Harpster et al., 1988, Mol Gen Genet. 212(1):182-90). WO95/006742 describes the use of 5' non-translated leader sequences derived from genes coding for heat shock proteins to increase transgene expression.
The chimeric gene may also comprise a 3' end region, i.e. a transcription termination or polyadenylation sequence, operable in plant cells. As a transcription termination or polyadenylation sequence, use may be made of any corresponding sequence of bacterial origin, such as for example the nos terminator of Agrobacterium tumefaciens, of viral origin, such as for example the CaMV 35S terminator, or of plant origin, such as for example a histone terminator as described in published Patent Application EP 0 633 317 A1. The polyadenylation region can be derived from the natural gene, from a variety of other plant genes, or from T-DNA. The 3' end sequence to be added may be derived from, for example, the nopaline synthase or octopine synthase genes, or alternatively from another plant gene, or less preferably from any other eukaryotic gene.
The expression and targeting of proteorhodopsin to the thylakoid membranes will take advantage of the green region of the light spectrum that is inaccessible to chlorophyll. An increase in the amount of ATP is expected under photosynthesis conditions, from the proton gradient generated both by the photosystems and by the proteorhodopsin pump. Under conditions of inhibition of electron transfer through the photosystems, we should be able to observe a steady rate of ATP synthesis well above the basal rate through the activity of the proteorhodopsin proton pump.
Under normal pH conditions, protons are pumped into the bacterial periplasmic space by PR [50]. The photo-driven retinal cycle begins with photoisomerization of all-trans retinal to 13-cis retinal. The resulting conformational change poises the system for transfer of a proton from the Schiff base (SB; pKa ~11) to the counter ion, Asp97 (pKa ~7.5). The proton is transferred to the lumen via a proton-conducting channel, and the SB is reprotonated from the cytoplasm. The mechanism of proton release in PR is not as well understood as in bacteriorhodopsin (BR); however, the main events of the photocycle are expected to be similar to those of BR. One potential challenge for pumping protons by PR in thylakoid membranes is the pH gradient-dependent reversibility of proton transfer by PR. At periplasmic pH below 5.5, proton flow in PR is reversed, potentially depleting the proton gradient and impairing ATP synthesis. Thus, at the lumenal pH of thylakoids (4.5), reversed proton transduction via PR is possible. One of the critical residues involved in reversible proton flow is Asp97, which acts as the proton acceptor from retinal. The pKa of Asp97 in PR is ~7.5, while the pKa of its counterpart in BR is ~2.5. Due to the extremely low pKa of the counter ion, BR is able to retain its forward pumping activity at pHs as low as 3.5. The ability of PR to act as a proton pump in the thylakoid membrane thus entails maintaining the pumping efficiency at the low pH conditions prevailing in the lumen. We propose that vectorial pumping of protons into the thylakoid lumen can be achieved by lowering the pKa of Asp97 and/or by protecting the SB from the lumenal pH through rational, site-specific mutagenesis. The electrostatic environment around the SB in PR is presumably maintained by the counter ions, Asp97, Asp227 (analogous to BR Asp212), Arg94 (analogous to BR Arg82) and His75. In BR, the low pKa of Asp85 is attributed to its strong hydrogen bonding interactions with Thr89 and Arg82 [53,54]. 
Since interactions that reduce the pKa of Asp97 will promote proton-pumping activity at low external pH, mutation of Met79 to a residue that can hydrogen bond to His75 and Asp212, like Tyr or Thr, will be explored. These mutations are proposed by overlaying the structures of BR and PR, and identifying residues which are in a position to effect the desired behavior. Finally, the ability of a modified PR to work as an efficient H+ pump at acidic pHs will also entail shielding the SB from the extracellular environment. To this end, a L219E/T206S mutant will be generated, wherein E219 and S206 will form a Glu-Ser gate regulating vectorial proton transfer as occurs in BR.
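The pKa rationale above can be illustrated with a simple Henderson-Hasselbalch estimate. The following Python sketch is illustrative only and is not part of the disclosed methods: it treats the counter ion as an isolated titratable group equilibrated with the lumenal pH, which is a simplifying assumption, and the function name fraction_deprotonated is hypothetical. The pKa values and the lumenal pH are the approximate figures quoted in the text.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of an acidic group present in its deprotonated form.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    Assumes a single, non-interacting titratable site (a simplification).
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


LUMEN_PH = 4.5          # lumenal pH of thylakoids quoted in the text
PKA_PR_ASP97 = 7.5      # approximate pKa of Asp97 in PR (from the text)
PKA_BR_COUNTERPART = 2.5  # approximate pKa of the BR counterpart (from the text)

pr_fraction = fraction_deprotonated(LUMEN_PH, PKA_PR_ASP97)
br_fraction = fraction_deprotonated(LUMEN_PH, PKA_BR_COUNTERPART)

print(f"PR Asp97 deprotonated at pH {LUMEN_PH}: {pr_fraction:.4f}")
print(f"BR counterpart deprotonated at pH {LUMEN_PH}: {br_fraction:.4f}")
```

Under these assumptions, only about 0.1% of a group with pKa 7.5 would be deprotonated, and thus available as a proton acceptor, at pH 4.5, versus about 99% for a group with pKa 2.5. This is consistent with the rationale for lowering the pKa of Asp97.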
To determine if any transgenes alter CET or ATP synthesis activity, we will compare the dark reduction kinetics of the photosystem I primary donor, P700+, in WT and transgenic plants, with and without dibromothymoquinone (DBMIB), an inhibitor of Cytb6f-mediated CET. Dark P700+ reduction kinetics are expected to be faster in plants with more active CET. In addition, we will assess the amplitude of the After Glow (AG) thermoluminescence band (~40° C.) associated with CET activity [11,14,16,43,63]. Pool sizes of ATP will also be assessed in WT and transgenic plants by mass spectroscopy.
Referring now to FIG. 11, additional transgenic Camelina lines were produced that expressed the BCA gene (SEQ ID NO:4) in the chloroplast stroma. These lines were produced using the Agrobacterium-mediated transformation procedures as described previously. Three lines were evaluated for their ability to accumulate biomass and provide improved photosynthetic rates. Wildtype Camelina and the BCA mutant lines were not significantly different at lower light levels (0-400 µmol/m²/s) in their ability to assimilate carbon dioxide. However, as light intensity increased, the BCA transformants showed between 10 and 30% higher accumulation of CO₂ at 2000 µmol/m²/s than wildtype. Line BCA-9.2 was the highest, while lines BCA-4.1 and BCA-5.7 were both about 10% higher than wildtype. This improved ability to assimilate CO₂ was reflected in increased biomass accumulation in two of the lines (BCA-5.7 and BCA-9.2), with these lines having about 15% greater biomass accumulation than wildtype. The BCA-4.1 line did not show improved biomass accumulation compared to control.
Referring now to FIG. 12, the ability of the chloroplast envelope-localized bicarbonate transporter (LCIA) protein to transport bicarbonate and improve the capture of inorganic carbon by transgenic Camelina was determined following the method of Farquhar and colleagues (1989). LCIA transgenic Camelina were produced using the Agrobacterium-mediated transformation process described previously. An LCIA-expressing mutant line (CAM-LCIA) was compared to wildtype Camelina (Cam-WT) for the observed discrimination of the stable isotope ¹³C. This carbon isotope discrimination is expressed as the difference between the ¹³C in the air and in a plant which has been previously exposed to ¹³CO₂. The carbon isotope discrimination is symbolized by Δ and expressed in parts per thousand (per mil, ‰), as described by Farquhar and colleagues (1989). In the LCIA transgenic lines, the observed discrimination by the plant was 20% less than that observed in the wildtype. This indicates that the insertion of LCIA provides the plant the ability to better accumulate and retain inorganic carbon than the wildtype plant, with decreased "leakiness" vs wildtype. Reference for ¹³C discrimination: Carbon isotope discrimination and photosynthesis, G. D. Farquhar, J. R. Ehleringer and K. T. Hubick. Annu. Rev. Plant Physiol. Plant Mol. Biol. 1989, 40, 503-537.
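For orientation, the discrimination value used in this comparison can be computed from the isotopic compositions of source air and plant material using the relation given by Farquhar and colleagues (1989), Δ = (δa − δp)/(1 + δp). The Python sketch below is illustrative only: the function name and the example δ values are assumptions chosen for demonstration and are not measurements from the Camelina lines. The δ values are entered in per mil, as in the cited review.

```python
def carbon_isotope_discrimination(delta_air_permil: float,
                                  delta_plant_permil: float) -> float:
    """Return discrimination (per mil) from delta values given in per mil.

    Implements Delta = (delta_a - delta_p) / (1 + delta_p), where the
    delta values are first converted from per mil to plain fractions.
    """
    delta_a = delta_air_permil / 1000.0
    delta_p = delta_plant_permil / 1000.0
    return (delta_a - delta_p) / (1.0 + delta_p) * 1000.0


# Hypothetical example values (not data from this disclosure):
# source air at -8 per mil and plant tissue at -28 per mil.
example_delta = carbon_isotope_discrimination(-8.0, -28.0)
print(f"Discrimination: {example_delta:.2f} per mil")
```

A lower value for a transgenic line than for wildtype, measured under the same conditions, corresponds to the reduced discrimination reported for the LCIA lines.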
TABLE-US-00003 TABLE D1
Isoenzyme | kcat (s⁻¹) | Km (mM) | kcat/Km (M⁻¹s⁻¹) | Ki (nM) | Subcellular localization | Tissue/organ localization
---|---|---|---|---|---|---
hCAI | 2 × 10⁵ | 4.0 | 5.0 × 10⁷ | 250 | cytosol | E, GI
hCAII | 1.4 × 10⁶ | 9.3 | 1.5 × 10⁸ | 12 | cytosol | E, eye, GI, BO, K, L, T, B
hCAIII | 1.0 × 10⁴ | 33.3 | 3.0 × 10⁵ | 2 × 10⁵ | cytosol | SM, A
hCAIV | 1.0 × 10⁶ | 21.5 | 5.1 × 10⁷ | 74 | membrane | K, L, P, B, C, H
hCAVA | 2.9 × 10⁵ | 10.0 | 2.9 × 10⁷ | 63 | mitochondria | Li
hCAVB | 9.5 × 10⁵ | 9.7 | 9.8 × 10⁷ | 54 | mitochondria | H, SM, P, K, SC, GI
hCAVI | 3.4 × 10⁵ | 6.9 | 4.9 × 10⁷ | 11 | secreted | G
hCAVII | 9.5 × 10⁵ | 11.4 | 8.3 × 10⁷ | 2.5 | cytosol | CNS
hCAVIII | | | | | cytosol | CNS
hCAIX | 3.8 × 10⁵ | 6.9 | 5.5 × 10⁷ | 25 | transmembrane | TU, GI
hCAX | | | | | cytosol | CNS
hCAXI | | | | | cytosol | CNS
hCAXII | 4.2 × 10⁵ | 12.0 | 3.5 × 10⁷ | 5.7 | transmembrane | R, I, RE, eye, TU
hCAXIII | 1.5 × 10⁵ | 13.8 | 1.1 × 10⁷ | 16 | cytosol | K, B, L, GI, RE
hCAXIV | 3.1 × 10⁵ | 7.9 | 3.9 × 10⁷ | 41 | transmembrane | K, B, L
hCAXV | 4.7 × 10⁵ | 14.2 | 3.3 × 10⁷ | 72 | membrane | K
h = human; m = mouse; hCAVIII, X, and XI are devoid of catalytic activity. E = erythrocytes; GI = GI tract; BO = bone osteoclasts; K = kidney; L = lung; T = testis; B = brain; SM = skeletal muscle; A = adipocytes; P = pancreas; C = colon; H = heart; Li = liver; SC = spinal cord; G = salivary and mammary gland; R = renal; I = intestinal; TU = tumors; RE = reproductive
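The kcat/Km column of Table D1 follows from the kcat and Km columns once Km is converted from mM to M. The short Python sketch below is illustrative only; the function name catalytic_efficiency and the choice of rows are assumptions made for the example, while the kcat and Km values are those listed in the table.

```python
def catalytic_efficiency(kcat_per_s: float, km_millimolar: float) -> float:
    """Return kcat/Km in M^-1 s^-1, given kcat in s^-1 and Km in mM."""
    km_molar = km_millimolar * 1e-3  # convert mM to M
    return kcat_per_s / km_molar


# (kcat in s^-1, Km in mM) for selected isoenzymes from Table D1
table_d1_rows = {
    "hCAI": (2.0e5, 4.0),
    "hCAII": (1.4e6, 9.3),
    "hCAVII": (9.5e5, 11.4),
}

for isoenzyme, (kcat, km) in table_d1_rows.items():
    print(f"{isoenzyme}: kcat/Km = {catalytic_efficiency(kcat, km):.1e} M^-1 s^-1")
```

For hCAVII, for example, 9.5 × 10⁵ s⁻¹ divided by 11.4 mM gives approximately 8.3 × 10⁷ M⁻¹s⁻¹.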
TABLE-US-00004 TABLE D2 Exemplary Type II Carbonic AnhydrasesAccession Organism Sequence Number SEQ. ID. NO Human MSHHWGYGKHNGPEHWHKDF PIAKGERQSP VDIDTHTAKY NP-000058.1 SEQ. ID. NO. 19DPSLKPLSVS YDQATSLRIL NNGHAFNVEF DDSQDKAVLK GGPLDGTYRL IQFHFHWGSLDGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGL AVLGIFLKVG SAKPGLQKVVDVLDSIKTKG KSADFTNFDP RGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSEQVLKFRKLNF NGEGEPEELM VDNWRPAQPL KNRQIKASFK Macaca MSHHWGYGKHNGPEHWHKDF PIAKGQRQSP VDIDTHTAKY BAE91302.1 SEQ. ID. NO. 24fascicularis DPSLKPLSVS YDQATSLRIL NNGHSFNVEF DDSQDKAVIK(crab-eating GGPLDGTYRL IQFHFHWGSL DGQGSEHTVD KKKYAAELHL macaque)VHWNTKYGDF GKAVQQPDGL AVLGIFLKVG SAKPGLQKVV DVLDSIKTKG KSADFTNFDPRGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSE QMSKFRKLNF NGEGEPEELMVDNWRPAQPL KNRQIKASFK Pan MSHHWGYGKH NGPEHWHKDF PIAKGERQSPVDIDTHTAKY NP_001181853 SEQ. ID. NO. 25 troglodytes DPSLKPLSVSYGQATSLRIL NNGHAFNVEF DDSQDKAVLK GGPLDGTYRL IQFHFHWGSL DGQGSEHTVDKKKYAAELHL VHWNTKYGDF GKAVQQPDGL AVLGIFLKVG SAKPGLQKVV DVLDSIKTKGKSADFTNFDP HGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSE QMLKFRKLNFNGEGEPEELM VDNWRPAQPL KNRQIKASFK Macaca MSHHWGYGKH NGPEHWHKDFPIAKGQRQSP VDINTHTAKY NP_001182346 SEQ. ID. NO. 26 mulattaDPSLKPLSVS YDQATSLRIL NNGHSFNVEF DDSQDKAVIK GGPLDGTYRL IQFHFHWGSLDGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGL AVLGIFLKVG SAKPGLQKVVDVLDSIKTKG KSADFTNFDP RGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSEQMSKFRKLNF NGEGEPEELM VDNWRPAQPL KNRQIKASFK Pongo abelii MSHHWGYGKHNGPEHWHKDF PIAKGERQSP VDIDTHTAKY XP_002819286 SEQ. ID. NO. 27DPSLKPLSVC YDQATSLRIL NNGHSFNVEF DDSQDKAVLK GGPLDGTYRL IQFHFHWGSLDGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGL AVLGIFLKVG SAKPGLQKVVDVLDSIKTKG KCADFTNFDP RGLLPASLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSEQMLKFRKLNF NGEGEPEELM VDNWRPAQPL KKRQIKASFK Callithrix MSHHWGYGKHNGPEHWHKDF PIAKGERQSP VDIDTHTAKY XP_002759086 SEQ. ID. NO. 
28jacchus DPSLKPLSVS YDQATSWRIL NNGHSFNVEF DDSQDKAVLK GGPLDGTYRLIQFHFHWGST DGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAAQQPDGL AVLGIFLKVGSAKPGLQKVV DVLDSIKTKG KSADFTNFDP RGLLPESLDY WTYPGSLTTP PLLESVTWIVLKEPISVSSE QILKFRKLNF SGEGEPEELM VDNWRPAQPL KNRQIKASFK Lemur cattaMSHHWGYGKH NGPEHWHKDF PIAKGERQSP VDINTGAAKH ADD83028 SEQ. ID. NO.29 DPSLKPLSVY YEQATSRRIL NNGHSFNVEF DDSQDKAVLK GGPLDGTYRLIQFHFHWGSL DGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGL AVLGIFLKVGSAKPGLQKVV DVLDSIKTKG KSADFTNFDP RGLLPESLDY WTYLGSLTTP PLLECVTWIVLKEPISVSSE QMMKFRKLSF SGEGEPEELM VDNWRPAQPL KNRQIKASFK AiluropodaMAHHWGYGKH NGPEHWYKDF PIAKGQRQSP VDIDTKAAIH XP_002916939 SEQ. ID.NO. 30 melanoleuca DPALKALCPT YEQAVSQRVI NNGHSFNVEF DDSQDNAVLKGGPLTGTYRL IQFHFHWGSS DGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGLAVLGIFLKIG DARPGLQKVL DALDSIKTKG KSADFTNFDP RGLLPESLDY WTYPGSLTTPPLLECVTWIV LKEPISVSSE QMLKFRRLNF NKEGEPEELM VDNWRPAQPL HNRQINASFKEguus MSHHWGYGQH NGPKHWHKDF PIAKGQRQSP VDIDTKAAVH XP_001488540 SEQ.ID. NO. 31 caballus DAALKPLAVH YEQATSRRIV NNGHSFNVEF DDSQDKAVLQGGPLTGTYRL IQFHFHWGSS DGQGSEHTVD KKKYAAELHL VHWNTKYGDF GKAVQQPDGLAVVGVFLKVG GAKPGLQKVL DVLDSIKTKG KSADFTNFDP RGLLPESLDY WTYPGSLTTPPLLECVTWIV LREPISVSSE QLLKFRSLNF NAEGKPEDPM VDNWRPAQPL NSRQIRASFKCanis lupus MAHHWGYAKH NGPEHWHKDF PIAKGERQSP VDIDTKAAVHNP_001138642 SEQ. ID. NO. 32 familiaris DPALKSLCPC YDQAVSQRIINNGHSFNVEF DDSQDKTVLK GGPLTGTYRL IQFHFHWGSS DGQGSEHTVD KKKYAAELHLVHWNTKYGEF GKAVQQPDGL AVLGIFLKIG GANPGLQKIL DALDSIKTKG KSADFTNFDPRGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPISVSSE QMLKFRKLNF NKEGEPEELMMDNWRPAQPL HSRQINASFK Oryctolagus MSHHWGYGKH NGPEHWHKDF PIANGERQSPIDIDTNAAKH NP_001182637 SEQ. ID. NO. 33 cuniculus DPSLKPLRVCYEHPISRRII NNGHSFNVEF DDSHDKTVLK EGPLEGTYRL IQFHFHWGSS DGQGSEHTVNKKKYAAELHL VHWNTKYGDF GKAVKHPDGL AVLGIFLKIG SATPGLQKVV DTLSSIKTKGKSVDFTDFDP RGLLPESLDY WTYPGSLTTP PLLECVTWIV LKEPITVSSE QMLKFRNLNFNKEAEPEEPM VDNWRPTQPL KGRQVKASFV Ailuropoda GPEHWYKDFP IAKGQRQSPVDIDTKAAIHD PALKALCPTY EFB24165 SEQ. ID. NO. 
34 melanoleucaEQAVSQRVIN NGHSFNVEFD DSQDNAVLKG GPLTGTYRLI QFHFHWGSSD GQGSEHTVDKKKYAAELHLV HWNTKYGDFG KAVQQPDGLA VLGIFLKIGD ARPGLQKVLD ALDSIKTKGKSADFTNFDPR GLLPESLDYW TYPGSLTTPP LLECVTWIVL KEPISVSSEQ MLKFRRLNFNKEGEPEELMV DNWRPAQPLH NRQINASFK Sus scrofa MSHHWGYDKH NGPEHWHKDFPIAKGDRQSP VDINTSTAVH XP_001927840.1 SEQ. ID. NO. 35 DPALKPLSLCYEQATSQRIV NNGHSFNVEF DSSQDKGVLE GGPLAGTYRL IQFHFHWGSS DGQGSEHTVDKKKYAAELHL VHWNTKYKDF GEAAQQPDGL AVLGVFLKIG NAQPGLQKIV DVLDSIKTKGKSVEFTGFDP RDLLPGSLDY WTYPGSLTTP PLLESVTWIV LREPISVSSG QMMKFRTLNFNKEGEPEHPM VDNWRPTQPL KNRQIRASFQ Callithrix MSHHWGYGKH NGPEHWHKDFPIAKGERQSP VDIDTHTAKY XP_002759087 SEQ. ID. NO. 36 jacchusDPSLKPLSVS YDQATSWRIL NNGHSFNVEF DDSQDKAVLK GGPLDGTYRL IQLHLVHWNTKYGDFGKAAQ QPDGLAVLGI FLKVGSAKPG LQKVVDVLDS IKTKGKSADF TNFDPRGLLPESLDYWTYPG SLTTPPLLES VTWIVLKEPI SVSSEQILKF RKLNFSGEGE PEELMVDNWRPAQPLKNRQI KASFK Mus MSHHWGYSKH NGPENWHKDF PIANGDRQSP VDIDTATAQHNP_033931 SEQ. ID. NO. 37 musculus DPALQPLLIS YDKAASKSIV NNGHSFNVEFDDSQDNAVLK GGPLSDSYRL IQFHFHWGSS DGQGSEHTVN KKKYAAELHL VHWNTKYGDFGKAVQQPDGL AVLGIFLKIG PASQGLQKVL EALHSIKTKG KRAAFANFDP CSLLPGNLDYWTYPGSLTTP PLLECVTWIV LREPITVSSE QMSHFRTLNF NEEGDAEEAM VDNWRPAQPLKNRKIKASFK Bos taurus MSHHWGYGKH NGPEHWHKDF PIANGERQSP VDIDTKAVVQNP_848667 SEQ. ID. NO. 38 DPALKPLALV YGEATSRRMV NNGHSFNVEYDDSQDKAVLK DGPLTGTYRL VQFHFHWGSS DDQGSEHTVD RKKYAAELHL VHWNTKYGDFGTAAQQPDGL AVVGVFLKVG DANPALQKVL DALDSIKTKG KSTDFPNFDP GSLLPNVLDYWTYPGSLTTP PLLESVTWIV LKEPISVSSQ QMLKFRTLNF NAEGEPELLM LANWRPAQPLKNRQVRGFPK Oryctolagus GKHNGPEHWH KDFPIANGER QSPIDIDTNA AKHDPSLKPLAAA80531 SEQ. ID. NO. 39 cuniculus RVCYEHPISR RIINNGHSFN VEFDDSHDKTVLKEGPLEGT YRLIQFHFHW GSSDGQGSEH TVNKKKYAAE LHLVHWNTKY GDFGKAVKHPDGLAVLGIFL KIGSATPGLQ KVVDTLSSIK TKGKSVDFTD FDPRGLLPES LDYWTYPGSLTTPPLLECVT WIVLKEPITV SSEQMLKFRN LNFNKEAEPE EP Rattus MSHHWGYSKSNGPENWHKEF PIANGDRQSP VDIDTGTAQH NP062164 SEQ. ID. NO. 
40norvegicus DPSLQPLLIC YDKVASKSIV NNGHSFNVEF DDSQDFAVLK EGPLSGSYRLIQFHFHWGSS DGQGSEHTVN KKKYAAELHL VHWNTKYGDF GKAVQHPDGL AVLGIFLKIGPASQGLQKIT EALHSIKTKG KRAAFANFDP CSLLPGNLDY WTYPGSLTTP PLLECVTWIVLKEPITVSSE QMSHFRKLNF NSEGRAEELM VDNWRPAQPL KNRKIKASFK
TABLE-US-00005 TABLE D3 Exemplary Type VII Carbonic AnhydrasesAccession Organism Sequence Number SEQ. ID. NO Human MSLSITNNGHSVQVDFNDSD DRTVVTGGPL EGPYRLKQFH SEQ. ID. NO. 41 FHWGKKHDVGSEHTVDGKSF PSELHLVHWN AKKYSTFGEA ASAPDGLAVV GVFLETGDEH PSMNRLTDALYMVRFKGTKA QFSCFNPKCL LPASRHYWTY PGSLTTPPLS ESVTWIVLRE PICISERQMGKFRSLLFTSE DDERIHMVNN FRPPQPLKGR VVKASFRA Pongo MTGHHGWGYGQDDGPSHWHK LYPIAQGDRQ SPINIISSQA XP_002826555 SEQ. ID. NO. 42abelii VYSPSLQPLE LSYEACMSLS ITNNGHSVQV DFNDSDDRTV VTGGPLEGPYRLKQFHFHWG KKHDVGSEHT VDGKSFPSEL HLVHWNAKKY STFGEAASAP DGLAVVGVFLETGDEHPSMN RLTDALYMVR FKGTKAQFSCFNPKSLLPAS RHYWTYPGSL TTPPLSESVTWIVLREPICI SERQMGKFRS LLFTSEDDER IHMVNNFRPP QPLKGRVVKA SFRA PanMEFGLSPELS PSRCFKRLLR GSERGRSRSP NERTEPTGQV XP_001143159.1 SEQ. ID.NO. 43 troglodytes HGCGDGSGMT GHHGWGYGQD DGPSHWHKLY PIAQGDRQSPINIISSQAVY SPSLQPLELS YEACMSLSIT NNGHSVQVDF NDSDDRTVVT GGPLEGPYRLKQFHFHWGKK HDVGSEHTVD GKSFPSELHL VHWNAKKYST FGEAASAPDG LAVVGVFLETGDEHPSMNRL TDALYMVRFK GTKAQFSCFN PKCLLPASRH YWTYPGSLTT PPLSESVTWIVLREPICISE RQMRKFRSLL FTSEDDERIH MVNNFRPPQP LKGRVVKASF RACallithrix MTGHHGWGYG QDDGPSHWHK LYPIAQGDRQ SPINIISSQA XP_002761099SEQ. ID. NO. 44 jacchus VYSPSLQPLE LSYEACMSLS ITNNGHSVQV DFNDSDDRTVVTGGPLEGPY RLKQFHFHWG KKHDVGSEHT VDGKSFPSEL HLVHWNAKKY STFGEAASAPDGLAVVGVFL ETGDEHPSMN RLTDALYMVR FKGTKAQFSC FNPKCLLPAS WHYWTYPGSLTTPPLSESVT WIVLREPICI SERQMGKFRS LLFTSEDDER VHMVNNFRPP QPLKGRVVKASFRA Ailuropoda GPSQWHKLYP IAQGDRQSPI NIVSSQAVYS PSLKPLELSYEFB15849 SEQ. ID.NO. 45 melanoleuca EACISLSIAN NGHSVQVDFNDSDDRTVVTG GPLDGPYRLK QFHFHWGKKH SVGSEHTVDG KSFPSELHLV HWNAKKYSTFGEAASAPDGL AVVGVFLETG DEHPSMNRLT DALYMVRFKG TKAQFSCFNP KCLLPASRHYWTYPGSLTTP PLSESVTWIV LREPISISER QMEKFRSLLF TSEDDERIHM VNNFRPPQPLKGRVVKASFR A Canis MTGHHCWGYG QNDEIQASLS PSLSTPAGPS QWHKLYPIAQXP_546892 SEQ. ID. NO. 
46 familiaris GDRQSPINIV SSQAVYSPSLKPLELSYEAC ISLSITNNGH SVQVDFNDSD DRTAVTGGPL DGPYRLKQLH FHWGKKHSVGSEHTVDGKSF PSELHLVHWN AKKYSTFGEA ASAPDGLAVV GIFLETGDEH PSMNRLTDALYMVRFKGTKA QFSCFNPKCL LPASRHYWTY PGSLTTPPLS ESVTWIVLRE PISISERQMEKFRSLLFTSE EDERIHMVNN FRPPQPLKGR VVKASFRA Bos taurus MTGHHGWGYGQNDGPSHWHK LYPIAQGDRQ SPINIVSSQA XP_002694851 SEQ. ID. NO. 47VYSPSLKPLE ISYESCTSLS IANNGHSVQV DFNDSDDRTV VSGGPLDGPY RLKQFHFHWGKKHGVGSEHT VDGKSFPSEL HLVHWNAKKY STFGEAASAP DGLAVVGVFL ETGDEHPSMNRLTDALYMVR FKGTKAQFSC FNPKCLLPAS RHYWTYPGSL TTPPLSESVT WIVLREPIRISERQMEKFRS LLFTSEEDER IHMVNNFRPP QPLKGRVVKA SFRA Rattus MTVLWWPMLREELMSKLRTG GPSNWHKLYP IAQGDRQSPI EDL87229 SEQ. ID. NO. 48norvegicus NIISSQAVYS PSLQPLELFY EACMSLSITN NGHSVQVDFN DSDDRTVVAGGPLEGPYRLK QLHFHWGKKR DVGSEHTVDG KSFPSELHLV HWNAKKYSTF GEAAAAPDGLAVVGIFLETG DEHPSMNRLT DALYMVRFKD TKAQFSCFNP KCLLPTSRHY WTYPGSLTTPPLSESVTWIV LREPIRISER QMEKFRSLLF TSEDDERIHM VNNFRPPQPL KGRVVKASFQ SOryctolagus MTGHHGWGYG QDDGGRPSHW HKLYPIAQGD RQSPINIVSSXP_002711604 SEQ. ID. NO. 49 cuniculus QAVYSPGLQP LELSYEACTSLSIANNGHSV QVDFNDSDDR TVVTGGPLEG PYRLKQFHFH WGKRRDAGSE HTVDGKSFPSELHLVHWNAR KYSTFGEAAS APDGLAVVGV FLETGNEHPS MNRLTDALYM VRFKGTKAQFSCFNPKCLLP SSRHYWTYPG SLTTPPLSES VTWIVLREPI SISERQMEKF RSLLFTSEDDERVHMVNNFR PPQPLRGRVV KASFRA Mus GQDDGPSNWH KLYPIAQGDR QSPINIISSQAVYSPSLQPL AAG16230.1 SEQ. ID. NO. 50 musculus ELFYEACMSLSITNNGHSVQ VDFNDSDDRT VVSGGPLEGP YRLKQLHFHW GKKRDMGSEH TVDGKSFPSELHLVHWNAKK YSTFGEAAAA PDGLAVVGVF LETGDEHPSM NRLTDALYMV RFKDTKAQFSCFNPKCLLPT SRHYWTYPGS LTTPPLSESV TWIVLREPIR ISERQMEKFR SLLFTSEDDERIHMVDNFRP PQPLKGRVVK ASFQA Monodelphis MTGHHGWGYG QEDGPSEWHKLYPIAQGDRQ SPIDIVSSQA XP_001364411.1 SEQ. ID. NO. 51 domesticVYDPTLKPLV LAYESCMSLS IANNGHSVMV EFDDVDDRTV VNGGPLDGPY RLKQFHFHWGKKHSLGSEHT VDGKSFSSEL HLVHWNGKKY KTFAEAAAAP DGLAVVGIFL ETGDEHASMNRLTDALYMVR FKGTKAQFNS FNPKCLLPMN LSYWTYPGSL TTPPLSESVT WIVLKEPITISEKQMEKFRS LLFTAEEDEK VRMVNNFRPP QPLKGRVVQA SFRS Gallus MTGHHSWGYGQDDGPAEWHK SYPIAQGNRQ SPIDIISAKA XP_414152.1 SEQ. ID. 
NO. 52 gallusVYDPKLMPLV ISYESCTSLN ISNNGHSVMV EFEDIDDKTV ISGGPFESPF RLKQFHFHWGAKHSEGSEHT IDGKPFPCEL HLVHWNAKKY ATFGEAAAAP DGLAVVGVFL EIGKEHANMNRLTDALYMVK FKGTKAQFRS FNPKCLLPLS LDYWTYLGSL TTPPLNESVI WVVLKEPISISEKQLEKFRM LLFTSEEDQK VQMVNNFRPP QPLKGRTVRA SFKA TaeniopygiaMTGQHSWGYG QADGPSEWHK AYPIAQGNRQ SPIDIDSARA XP_002190292.1 SEQ. ID.NO. 53 guttata VYDPSLQPLL ISYESCSSLS ISNTGHSVMV EFEDTDDRTAISGGPFQNPF RLKQFHFHWG TTHSQGSEHT IDGKPFPCEL HLVHWNARKY TTFGEAAAAPDGLAVVGVFL EIGKEHASMN RLTDALYMVK FKGTKAQFRG FNPKCLLPLS LDYWTYLGSLTTPPLNESVT WIVLKEPIRI SVKQLEKFRM LLFTGEEDQR IQMANNFRPP QPLKGRIVRASFKA
TABLE-US-00006 TABLE D4 Exemplary Type XIII Carbonic AnhydrasesAccession Organism Sequence Number SEQ. ID. NO Human MSRLSWGYREHNGPIHWKEF FPIADGDQQS PIEIKTKEVK NP_940986.1 SEQ. ID. NO. 54YDSSLRPLSI KYDPSSAKII SNSGHSFNVD FDDTENKSVL RGGPLTGSYR LRQVHLHWGSADDHGSEHIV DGVSYAAELH VVHWNSDKYP SFVEAAHEPD GLAVLGVFLQ IGEPNSQLQKITDTLDSIKE KGKQTRFTNF DLLSLLPPSW DYWTYPGSLT VPPLLESVTW IVLKQPINISSQQLAKFRSL LCTAEGEAAA FLVSNHRPPQ PLKGRKVRAS FH Pan MSRLSWGYREHNGPIHWKEF FPIADGDQQS PIEIKTKEVK XP_001169377.1 SEQ. ID. NO. 55troglodytes YDSSLRPLSI KYDPSSAKII SNSGHSFNVD FDDTENKSVL RGGPLTGSYRLRQFHLHWGS ADDHGSEHIV DGVSYAAELH VVHWNSDKYP SFVEAAHEPD GLAVLGVFLQIGEPNSQLQK ITDTLDSIKE KGKQTRFTNF DPLSLLPPSW DYWTYPGSLT VPPLLESVTWIVLKQPINIS SQQLAKFRSL LCTAEGEAAA FLVSNHRPPQ PLKGRKVRAS FH MacacaMSRLSWGYRE HNGPIHWKEF FPIADGDQQS PIEIKTQEVK XP_001095487.1 SEQ. ID.NO. 56 mulatta YDSSLRPLSI KYDPSSAKII SNSGHSFNVD FDDTEDKSVLRGGPLAGSYR LRQFHLHWGS ADDHGSEHIV DGVSYAAELH VVHWNSDKYP SFVEAAHEPDGLAVLGVFLQ IGEPNSQLQK ITDILDSIKE KGKQTRFTNF DPLSLLPPSW DYWTYPGSLTVPPLLESVIW IVLKQPINVS SQQLAKFRSL LCTAEGEAAA FLLSNHRPPQ PLKGRKVRASFR Oryctolagus MSRISWGYGE HNGPIHWNQF FPIADGDQQS PIEIKTKEVKXP_002710714.1 SEQ. ID. NO. 57 cuniculus YDSSLRPLSI KYDPSSAKIISNSGHSFNVD FDDTEDKSVL RGGPLTGNYR LRQFHLHWGS ADDHGSEHVV DGVRYAAELHVVHWNSDKYP SFVEAAHEPD GLAVLGVFLQ IGEYNSQLQK ITDILDSIKE KGKQTRFTNFDPLSLLPSSW DYWTYPGSLT VPPLLESVTW IVLKQPINIS SQQLAKFRSL LCSAEGESAAFLLSNHRPPQ PLKGRKVRAS FH Ailuropoda MSRLSWGYGE HNGPIHWNKFFPIADGDQQS PIEIKTKEVK XP_002916937.1 SEQ. ID. NO. 58 melanoleucaYDSSLRPLSI KYDANSAKII SNSGHSFSVD FDDTEDKSVL RGGPLTGSYR LRQFHLHWGSADDHGSEHVV DGVRYAAELH VVHWNSDKYP SFVEAAHEPD GLAVLGVFLQ IGEHNSQLQKITDILDSIKE KGKQTRFTNF DPLSLLPPSW DYWTYPGSLT VPPLLESVTW IVLKQPINISSEQLATFRTL LCTAEGEAAA FLLSNHRPPQ PLKGRKVRAS FH Sus scrofaMSRFSWGYGE HNGPVHWNEF FPIADGDQQS PIEIKTKEVK XP_001924497.1 SEQ. ID.NO. 
59 YDSSLRPLSI KYDPSSAKII SNSGHSFSVD FDDTEDKSVL RGGPLTGSYRLRQFHLHWGS ADDHGSEHVV DGVKYAAELH VVHWNSDKYP SFVEAAHEPD GLAVLGVFLQIGEHNSQLQK ITDILDSIKE KGKQTRFTNF DPLSLLPPSW DYWTYPGSLT VPPLLESVTWIILKQPINIS SQQLATFRTL LCTKEGEEAA FLLSNHRPLQ PLKGRKVRAS FHCallithrix MSRLSWGYGE HNGPIHWNEF FPIADGDRQS PIEIKAKEVKXP_002759085.1 SEQ. ID. NO. 60 jacchus YDSSLRPLSI KYDPSSAKIISNSGHSFNVD FDDTEDKSVL HGGPLTGSYR LRQFHLHWGS ADDHGSEHVV DGVRYAAELHVVHWNSEKYP SFVEAAHEPD GLAVLGVFLQ IGEPNSQLQK IIDILDSIKE KGKOIRFTNFDPLSLFPPSW DYWTYSGSLT VPPLLESVTW ILLKQPINIS SQQLAKFRSL LCTAEGEAAAFLLSNYRPPQ PLKGRKVRAS FR Rattus MARLSWGYDE HNGPIHWNEL FPIADGDQQSPIEIKTKEVK NP_001128465.1 SEQ. ID. NO. 61 norvegicus YDSSLRPLSIKYDPASAKII SNSGHSFNVD FDDTEDKSVL RGGPLTGSYR LRQFHLHWGS ADDHGSEHVVDGVRYAAELH VVHWNSDKYP SFVEAAHESD GLAVLGVFLQ IGEHNPQLQK ITDILDSIKEKGKQTRFTNF DPLCLLPSSW DYWTYPGSLT VPPLLESVTW IVLKQPISIS SQQLARFRSLLCTAEGESAA FLLSNHRPPQ PLKGRRVRAS FY Mus MARLSWGYGE HNGPIHWNELFPIADGDQQS PIEIKTKEVK NP_078771.1 SEQ. ID. NO. 62 musculusYDSSLRPLSI KYDPASAKIISNSGHSFNVD FDDTEDKSVL RGGPLTGNYR LRQFHLHWGSADDHGSEHVV DGVRYAAELH VVHWNSDKYP SFVEAAHESD GLAVLGVFLQ IGEHNPQLQKITDILDSIKE KGKQTRFTNFDPLCLLPSSW DYWTYPGSLT VPPLLESVTW IVLKQPISISSQQLARFRSL LCTAEGESAA FLLSNHRPPQ PLKGRRVRAS FY Canis MPPRRHGPNTFLSAGTKGQQ NFWTKNQKSG PIHWNKFFPI XP_544159 SEQ. ID. NO. 63familiaris ADGDQQSPIE IKTKEVKYDS SLRPLSIKYD ANSAKIISNS GHSFSVDFDDTEDKSVLRGG PLTGSYRLRQ FHLHWGSADD HGSEHVVDGV RYAAELHVVH WNSDKYPSFVEAAHEPDGLA VLGVFLQIGE HNSQLQKITD ILDSIKEKGK QTRFTNFDPL SLLPPSWDYWTYPGSLTVPP LLESVTWIVL KQPINISSQQ LATFRTLLCT AEGEAAAFLL SNHRPPQPLKGRKVRASFH Eguus MSGPVHWNEF FPIADGDQQS PIEIKTKEVK YDSSLRPLTIXP_001489984.2 SEQ. ID. NO. 64 caballus KYDPSSAKII SNSGHSFSVGFDDTENKSVL RGGPLTGSYR LRQFHLHWGS ADDHGSEHVV DGVRYAAELH IVHWNSDKYPSFVEAAHEPD GLAVLGVFLQ VGEHNSQLQK ITDTLDSIKE KGKQTLFTNF DPLSLLPPSWDYWTYPGSLT VPPLLESVTW IILKQPINIS SQQLVKFRTL LCTAEGETAA FLLSNHRPPQPLKGRKVRAS FR Bos taurus MSGFSWGYGE RDGPVHWNEF FPIADGDQQSPIEIKTKEVR XP_002692875.1 SEQ. ID. NO. 
65 YDSSLRPLGI KYDASSAKIISNSGHSFNVD FDDTDDKSVL RGGPLTGSYR LRQFHLHWGS TDDHGSEHVV DGVRYAAELHVVHWNSDKYP SFVEAAHEPD GLAVLGIFLQ IGEHNPQLQK ITDILDSIKE KGKQTRFTNFDPVCLLPPCR DYWTYPGSLT VPPLLESVTW IILKQPINIS SQQLAAFRTL LCSREGETAAFLLSNHRPPQ PLKGRKVRAS FR Monodelphis MSRLSWGYCE HNGPVHWSELFPIADGDYQS PIEINTKEVK XP_001366749.1 SEQ. ID. NO. 66 domesticaYDSSLRPLSI KYDPASAKII SNSGHSFSVD FDDSEDKSVL RGGPLIGTYR LRQFHLHWGSTDDQGSEHTV DGMKYAAELH VVHWNSDKYP SFVEAAHEPD GLAVLGIFLQ TGEHNLQMQKITDILDSIKE KGKQIRFTNF DPATLLPQSW DYWTYPGSLT VPPLLESVTW IVLKQPITISSQQLAKFRSL LYTGEGEAAA FLLSNYRPPQ PLKGRKVRAS FR OrnithorhynchusMKKGVGSFYE LAVNRWSVVN RVQIMIVESI TEPLLCGSRA XP_001507177.1 SEQ. ID.NO. 67 anatinus LALTLSPTQA LAVAPALALA VVQALALTVV QALALAVSPALALSVAPALA LAVVQALALA VVQALALAVA QALALAVAQA LALAVAQALA LALPQALALTLPQALALTLS PTLALSVAPA LALAVAPALA LADSPALALA LARPHPSSGS SPALDCELVLFGDCHTVLLK WMRMGNYSSV SPLEERNSSC PLGPIHWNEL FPIADGDRQS PIEIKTKEVKYDSSLRPLSI KYDPTSAKII SNSGHSFSVD FDDTEDKSVL RGGPLSGTYR LRQFHFHWGSADDHGSEHTV DGMEYSAELH VVHWNSDKYS SFVEAAHEPD GLAVLGIFLK RGEHNLQLQKITDILDAIKE KGKQMRFTNF DPLSLLPLTR DYWTYPGSLT VPPLLESVIW IIFKQPISISSQQLAKFRNL LYTAEGEAAD FMLSNHRPPQ PLKGRKVRAS FRS
TABLE-US-00007 TABLE D5 Exemplary CA II DNA expression constructsfor chloroplast expression ATGTCCCATC ACTGGGGGTA CGGCAAACACAACGGACCTG AGCACTGGCA TAAGGACTTC SEQ. ID. NO. 94 CCCATTGCCAAGGGAGAGCG CCAGTCCCCT GTTGACATCG ACACTCATAC AGCCAAGTAT (human cDNAGACCCTTCCC TGAAGCCCCT GTCTGTTTCC TATGATCAAG CAACTTCCCT GAGGATCCTCsequence) AACAATGGTC ATGCTTTCAA CGTGGAGTTT GATGACTCTC AGGACAAAGCAGTGCTCAAG GGAGGACCCC TGGATGGCAC TTACAGATTG ATTCAGTTTC ACTTTCACTGGGGTTCACTT GATGGACAAG GTTCAGAGCA TACTGTGGAT AAAAAGAAAT ATGCTGCAGAACTTCACTTG GTTCACTGGA ACACCAAATA TGGGGATTTT GGGAAAGCTG TGCAGCAACCTGATGGACTG GCCGTTCTAG GTATTTTTTT GAAGGTTGGC AGCGCTAAAC CGGGCCTTCAGAAAGTTGTT GATGTGCTGG ATTCCATTAA AACAAAGGGC AAGAGTGCTG ACTTCACTAACTTCGATCCT CGTGGCCTCC TTCCTGAATC CTTGGATTAC TGGACCTACC CAGGCTCACTGACCACCCCT CCTCTTCTGG AATGTGTGAC CTGGATTGTG CTCAAGGAAC CCATCAGCGTCAGCAGCGAG CAGGTGTTGA AATTCCGTAA ACTTAACTTC AATGGGGAGG GTGAACCCGAAGAACTGATG GTGGACAACT GGCGCCCAGC TCAGCCACTG AAGAACAGGC AAATCAAAGCTTCCTTCAAA TAAgaattcATGTCtCATCAtTGGGGtTAtGGtAAACACAAtGGtCCTGAaCACTGGCATAAaGACTTSEQ. ID. NO. 108tCCaATTGCaAAaGGtGAaCGtCAaTCaCCTGTTGAtATtGACACTCATACAGCtAAaTATGACC(Optimiz- ed forCTTCttTaAAaCCatTaTCTGTTTCaTATGATCAAGCAACTTCttTacGtATttTaAACAATGGTchloropl- astCATGCTTTtAAtGTaGAaTTTGATGACTCTCAaGAtAAAGCAGTatTaAAaGGtGGtCCatTaGAExpressi- on)TGGtACTTACcGtTTaATTCAaTTTCACTTTCACTGGGGTTCAtTaGATGGtCAAGGTTCAGAaCATACTGTaGATAAAAAaAAATATGCTGCAGAAtTaCACTTaGTTCACTGGAACACaAAATATGGtGATTTTGGtAAAGCTGTaCAaCAACCTGATGGttTaGCtGTTtTAGGTATTTTTTTaAAaGTTGGtAGtGCTAAACCaGGtCTTCAaAAAGTTGTTGATGTatTaGATTCaATTAAAACAAAaGGtAAaAGTGCTGACTTtACTAAtTTCGATCCTCGTGGttTaCTTCCTGAATCtTTaGATTACTGGACaTAtCCAGGtTCAtTaACaACaCCTCCTCTTtTaGAATGTGTaACaTGGATTGTatTaAAaGAACCaATtAGtGTaAGtAGtGAaCAaGTaTTaAAATTCCGTAAACTTAAtTTCAATGGtGAaGGTGAACCaGAAGAAtTaATGGTtGAtAACTGGCGtCCAGCTCAaCCAtTaAAaAAtcGtCAAATtAAAGCTTCaTTCAAATAAgcatgc
TABLE-US-00008 TABLE D6 Codons in Human CA II optimized for expression in chloroplast of Chlamydomonas reinhardtii
Amino acid | Total number | Number of codons that were optimized | No. of amino acids of each codon | Expected ratio of codons
---|---|---|---|---
Ser(S) | 18 | 12 | TCT TCA AGT (7:7:5) | 1:1:1
Phe(F) | 12 | 3 | TTT TTC (8:4) | 2:1
Leu(L) | 26 | 19 | TTA CTT (21:5) | 5:1
Val(V) | 17 | 10 | GTT GTA (8:9) | 1:1
Pro(P) | 17 | 6 | CCT CCA (8:9) | 3:4
Thr(T) | 12 | 5 | ACT ACA (5:7) | 2:3
Ala(A) | 13 | 3 | GCT GCA (9:4) | 2:1
Tyr(Y) | 8 | 2 | TAT TAC (6:2) | 2:1
His(H) | 12 | 1 | CAT CAC (6:6) | 1:1
Asn(N) | 10 | 4 | AAT AAC (7:3) | 2.5:1
Asp(D) | 19 | 3 | GAT GAC (14:5) | 2.5:1
Ile(I) | 9 | 4 | ATT (9) | 1
Met(M) | 2 | 0 | ATG (2) | 1
Gln(Q) | 11 | 7 | CAA (11) | 1
Glu(E) | 13 | 6 | GAA (13) | 1
Lys(K) | 24 | 11 | AAA (24) | 1
Cys(C) | 1 | 0 | TGT (1) | 1
Trp(W) | 7 | 0 | TGG (7) | 1
Gly(G) | 22 | 17 | GGT (22) | 1
Arg(R) | 7 | 5 | CGT (7) | 1
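Per-codon tallies of the kind summarized in Table D6 can be obtained by splitting a coding sequence into triplets and counting them. The Python sketch below is illustrative only and is not the procedure used to prepare the table. The function names are hypothetical, and it assumes that lowercase letters in the chloroplast-optimized sequence of Table D5 mark substituted bases, which is inferred from comparison with the human cDNA sequence rather than stated in the text. The example input is the first 30 nucleotides of the optimized sequence (SEQ. ID. NO. 108).

```python
from collections import Counter


def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into consecutive triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def codon_counts(cds: str) -> Counter:
    """Count how often each codon occurs, ignoring letter case."""
    return Counter(codon.upper() for codon in split_codons(cds))


def optimized_codon_count(cds: str) -> int:
    """Count codons containing a lowercase base (assumed to mark a change)."""
    return sum(1 for codon in split_codons(cds) if codon != codon.upper())


# First ten codons of the optimized CA II sequence as printed in Table D5
fragment = "ATGTCtCATCAtTGGGGtTAtGGtAAACAC"

counts = codon_counts(fragment)
changed = optimized_codon_count(fragment)
print(dict(counts))
print(f"codons with substitutions: {changed} of {sum(counts.values())}")
```

Applied to the full optimized sequence (excluding the flanking restriction sites and the stop codon), the same counting would be expected to give totals comparable to those in Table D6, subject to the lowercase-marking assumption noted above.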
TABLE-US-00009 TABLE D7 Exemplary algal bicarbonate transporter types
Type | Transport mechanism | Substrate affinity | Flux rate | Photosynthetic affinity k0.5
---|---|---|---|---
BicA | Na+ dependent HCO₃⁻ uptake | Low-medium | High | 90-170 µM HCO₃⁻
SbtA | Na+ dependent HCO₃⁻ uptake | High | Low | <5 µM HCO₃⁻
TABLE-US-00010 TABLE D8 Exemplary plasma membrane localizedBicarbonate transporters Accession Organism Sequence Number SEQ.ID. NO Chlamydomonas MLPGLGVILL VLPMQYYFGY KIVQIKLQNA KHVALRSAIMEDP07736.1 SEQ. ID. NO. 77 reinhardtii QEVLPAIKLV KYYAWEQFFENQISKVRREE IRLNFWNCVM KVINVACVFC VPPMTAFVIF TTYEFQRARL VSSVAFTTLSLFNILRFPLV VLPKALRAVS EANASLQRLE AYLLEEVPSG TAAVKTPKNA PPGAVIENGVFHHPSNPNWH LHVPKFEVKP GQVVAVVGRI AAGKSSLVQA ILGNMVKEHG SFNVGGRISYVPQNPWLQNL SLRDNVLFGE QFDENKYTDV IESCALTLDL QILSNGDQSK AGIRGVNFSGGQRQRVNLAR CAYADADLVL LDNALSAVDH HTAHHIFDKC IKGLFSDKAV VLVTHQIEFMPRCDNVAIMD EGRCLYFGKW NEEAQHLLGK LLPITHLLHA AGSQEAPPAP KKKAEDKAGPQKSQSLQLTL APTSIGKPTE KPKDVQKLTA YQAALIYTWY GNLFLVGVCF FFFLAAQCSRQISDFWVRWW VNDEYKKFPV KGEQDSAATT FYCLIYLLLV GLFYIFMIFR GATFLWWVLKSSETIRRKAL HNVLNAPMGF FLVTPVGDLL LNFTKDQDIM DENLPDAVHF MGIYGLILLATTITVSVTIN FFAAFTGALI IMTLIMLSIY LPAATALKKA RAVSGGMLVG LVAEVLEGLGVVQAFNKQEY FIEEAARRTN ITNSAVFNAE ALNLWLAFWC DFIGACLVGV VSAFAVGMAKDLGGATVGLA FSNIIQMLVF YTWVVRFISE SISLFNSVEG MAYLADYVPH DGVFYDQRQKDGVAKQIVLP DGNIVPAASK VQVVVDDAAL ARWPATGNIR FEDVWMQYRL DAPWALKGVTFKINDGEKVG AVGRTGSGKS TTLLALYRMF ELGKGRILVD GVDIATLSLK RLRTGLSIIPQEPVMFTGTV RSNLDPFGEF KDDAILWEVL KKVGLEDQAQ HAGGLDGQVD GTGGKAWSLGQMQLVCLARA ALRAVPILCL DEATAAMDPH TEAIVQQTIK KVFDDRTTIT IAHRLDTIIESLMEYESPSK LLANRDSMFS KLVDKTGPAA AAALRKMAED FWSTRSAQGR NQ Volvoxcarteri MGTISHPARG NDPTAGFFNK EAFGWMFKHV SEARKNGDID XP_002950646.1SEQ. ID. NO. 69 f. 
nagariensis LDKMGMPPEN HAHEAYDMFA SNWAAEMKLKDSGAKPSLVR ALRKSFGLVY LLGGVFKCFW STFVITGAFY FVRSLLAHVN GIKDGRLYSKTVSGWCLMAG FTLDAWLLGL SLQRMGYICM SVGIRARAAL VQAVTHKAFR LSSVRADQSAAIVNFVSSDI QKIYDGALEF HYLWTAPFEA AAILALLGYL TNDSMLPGLG VILLVLPLQYFFGYKIIQIK LQNAKHVALR SSILQEVLPA IKLVKYYAWE QFFEDEISKI RREEMRLSFWNAMMKVINVA CVFCVPPMTA FVIFTTYEFQ KARLVSGVAF TTLSLFNILR FPLVVLPKALRAVSEAHASL QRLESYLLED VPQGTASGGK SSKSSAPGVH IDNAVYHHPS NPNWHLHVPRFDVRPGQVVA VVGRIGAGKS SLVQAILGNM VKEHGSQQVG GRISYVPQNP WLQNLSIRDNVTFGEGWDEN KYEAVIDACA LTMDLQILPQ GDQSKAGIRG VNFSGGQRQR VNLARCAYADADLVLLDNAL SAVDHHTAHH IFDKCIKGLF SDKAVVLITH QIEFMPRCDA VAIMDEGRCLYFGKWNEESQ HLLGKLLPIT HLLHAAGSQE APPAAPKKKD DKATPQKSQS LQLTLAPTSIGKPTQKDTKA APKLTAFKAA LIYTYYGNIL LVFVCFITFL AAQTCRQMSD FWVRWWVNDEYKHFPKRTGV REESATKFYA LIYLLLVGLF YFTMVARGST FLWWVLRSSE NIRKKALNNVLNAPMGFFLV TPVGDLLLNF TKDQDIMDEN LPDAIHFMGI YGLILLATTI TVSVTINFFGAFTGFLIIMT LIMLAIYLPA ATALKKARAV SGGQLVGLVA EVLEGLNVVQ AFSKQEYFIEEAARRTDVTN AAVFNAESLN LWLAFWCDLI GASLVGVVSA FAVGLKDQLG AATVGLAFSNIIQMLVFYTW VVRFIAESIS LFNSVEAMAW LADYVPKDGI FYDQKQLDGV AKSITLPDGQIVPATSKVQV VVDDAALARW PATGNIRFED VWMQYRLDAA WALKGVTFKI NDGEKVGAVGRTGSGKSTTL LALYRMFELG KGRILIDGVD IATLSLKRLR TGLSIIPQEP VMFTGTVRSNLDPFGEFKDD SVLWEVLQKV GLEAQAQHAG GLDGRVDGTG GKAWSLGQMQ LVCLARAALRAVPILCLDEA TAAMDPHTEQ VVQETIKKVF DDRTTITIAH RLDTIIESDK VLVMEAGELKEFAPPAQLLA NRETMFSKLV DKTGPAAAAA LRKMADEHFS KSQARAAAQR H ChlorellaMVPLLAQRGR IRSQAPRTWH PDPQPLHAER SRQCPGRGVR EFN52914.1 SEQ. ID. 
NO.70 variabilis AAAKRGGGSG GATHKSKKSK ELDEVAAFEQ LMCDWDDAFAADCYDNERAA RMARLAEEGY QHHGRGFVFV RSRLDKRSRK ARNDSGASKG FGAAAKALSVEQGTPLENNP QLHLLSWTAC YIASSQLDSL GGLFSTQEGV LLPDSGSLLT DGGSGASGSNAADAVGELQR VLRGQDLSQL RGYVGAPPQA RPASGSDDDG SSTTGSNNGA AGEGSEVEEGTAMGGIRRYE PESGELVVLL SCKIGGKPAV GAELLAVAQA EDGKHAPGAS PDTRLCKEPSQSAFDLWSFG WMNKIVPAAR RGEVEVADLP LPEAQQAEPC YEELNTNWEA AVQEAKKAGKEPKLMKVLWK TYGKDIVLAG IFKLMWSVFV ILGAYYFTRS ILMCIRTLEG KDDSIYDTEWKGWVLTGFFF LDAWLLGMML QRMAFNCLKV GIKARAALTT MIARKCYNMA HLTKDTAAEAVGFVASDINK VFEGIQEVHY LWGAPVEAGA ILALLGTLVG VYCIGGVIIV CMVVPLQYYFGYKIIKNKIK NAPNVTERWS IIQEILPAMK LVKYYAWERF FEKHVADMRT RERHYMFWNAVVKTVNVTMV FGVPPMVTFA VLVPYELWHV DSSTSEPYIK PQTAFTMLSL FNVLRFPLVVLPKAMRCVSE ALRSVGNLEK FLAEPVAPRQ DLEGKPGAQL SKAVLRHEMD TSGFTLRVPEFSVKAGELVA VVGRVGAGKS SILQAMLGNM QTASGLAKCQ HSASSCLPFL VEGTAHSGGRIAYVPQTAWC QNLSLRDNIT FGQPWDEAKY KQVIHACALE LDLAILAAGD QSKAGLRGINLSGGQRQRLN LARCAYFDGD LVLLDNALSA VDHHTAHHIF EHCVRGMFRD KATVLVTHQVEFLPQCDKVA IMDDGTCVYF GPWNAAAQQL LSKYLPASHL LAAGGNAEQP RDTKKKVVKKEETKKTEDAG KAKRVHSASL TLKSALWEYC WDARWIIFCL SLFFFLTAQA SRQLADYFIRWWTRDHYNKY GVLCIDEGDN PCGPLFYVQY YGILGLLCFI VLMAFRGAFL YTWSLGASYRQHEKSIHRVL YAPLGFFLTT PVGDLLVSFT KDQDVMDDAL PDALYYAGIY GLILLATAITVSVTIPLFSA LAGGLFVVSG IMLAIYLPAA THLKKLRMGT SGDVVTLIAE ALDGLGVIQAYGKQAYFTTI TSQYVNDAHR ALFGAESLNL WLAFICDFFG ACMVLSVACF GIGQWSTLGSSSVGLAFSQS IQMLVFYTWS IRLVAECIGL FGSAEKIAWL ANHTPQEAGS LDPPSLPGSGETKAAPKKRG TAGKFLPPLK DEDLAIVPTG GPKLPSGWPR TGVLEFNQVV MKYAPHLPPALRGVSFKVKS GDKVGVVGRT GSGKSTLLLA LYRMFNLESG AITLDGIDIS TLTLEQLRRGLSVIPQEPTV FSGTVRTNLD PFGEFGADAI LWEALRDCGL EEQVKACGGL DAKLDGTGGNAWSIGQQQLM CLARAALKKV PVLCLDEATA AMDPHTEAHV LEIIERIFSD RTMLTIAHRLDNVIRSDLVV VMDAGQVCEM GTPDELLANP_QSAFSQLVDK TGAASAAALR KMAADFLDERARGQKLGFKP RPSLEESHIC VAPSPSLILS TLLFPPAFMA NVTALLLPKP VLSHAPVSSQTVNTYIRLNI IQLQCNVLHP ATKEATWSSR RITFTAHLSS SGSKPPPPLP PLTELPEGRGLDWSSAGYRD GREAIPSPSA KYSAADYGAA GDGVTDDTQA LQVAVAAAHE DDEGGVVYLGAGTFVLTQPL SIAGSNVVIR GAGEDATTIF VPLPLSDVFP GTWSMDASGK 
VTSPWITRGGFLAFSGRRTK SSDSSTLLAT VAGSVEQGAS VIPVDSTAEF RLGQWVRIII NDASTDASAGGGTLERGSSE VQESETMIAE GATGGGAGVR AQWTGVLHAF EPTVQCSGVE QLTIRFNHSMMAAHLAERGY NAIELEDVVD CWIRQVTILN ADNAIRLRGT DHSTLSGQAC SGGGVVAVVPVWCRRGLPSP ADVTVGVTEL RWEPDTREVN GHHAITVSKG HANLVTRFRI TAPFYHDISLEGGALLNVIS SGGGANLNLD LHRSGPWGNL FSQLGMGLAA RPFDAGGRDG RGAHAGRQNTFWNLQPGDVA AAAPALQPSA AAGDARRLLV DGDSLLHAGT GQARLLRQLE ADDSAEPLLLPSCEFGPLLN FVGGFAGELC KSSGWLVAGL PDDRPDLHAS QVTARLQHGA ADNKTHASynechococcus MDFLSNFLMD FVKQLQSPTL SFLIGGMVIA ACGSQLQIPEABB57505.1 SEQ. ID. No. 71 elongatus SICKIIVFML LTKIGLTGGMAIRNSNLTEM VLPALFSVAI PCC 7942.J GILIVFIARY TLARMPKVKT VDAIATGGLFGAVSGSTMAA ALTLLEEQKI PYEAWAGALY PFMDIPALVT AIVVANIYLN KKKRKEAAFASAQGAYSKQP VAAGDYSSSS DYPSSRREYA QQESGDHRVK IWPIVEESLQ GPALSAMLLGVALGLFARPE SVYEGFYDPL FRGLLSILML VMGMEAWSRI SELRKVAQWY VVYSIVAPLAHGFIAFGLGM IAHYATGFSM GGVVVLAVIA ASSSDISGPP TLRAGIPSAN PSAYIGASTAIGTPVAIGIA IPLFLGLAQT IGG Synechocystis MDFLSNFLTD FVGQLQSPTLAFLIGGMVIA ALGTQLVIPE NP_441340 SEQ. ID. No.72 sp. AISTIIVFMLLTKIGLTGGM AIRNSNLTEM LLPVAFSVIL PCC 6803 GILIVFIARF TLAKLPNVRTVDALATGGLF GAVSGSTMAA ALTTLEESKI SYEAWAGALY PFMDIPALVT AIVVANIYLNKRKRKSAAAS IEESFSKQPV AAGDYGDQTD YPRTRQEYLS QQEPEDNRVK IWPIIEESLQGPALSAMLLG LALGIFTKPE SVYEGFYDPL FRGLLSILML IMGMEAWSRI GELRKVAQWYVVYSLIAPIV HGFIAFGLGM IAHYATGFSL GGVVVLAVIA ASSSDISGPP TLRAGIPSANPSAYIGSSTA IGTPIAIGVC IPLFIGLAQT LGAG Nostoc sp. MDFFSLFLMDFVKQLQSPTL GFLIGGMVIA ALGSELIIPE NP_486174 SEQ. ID. No. 73 PCC 712AICQIIVFML LTKIGLTGGI AIRNSNLTEM VLPAASAVAV GVLVVFIARY TLAKLPKVNTVDAIATGGLF GAVSGSTMAA ALTLLEEQKI QYEAWAAALY PFMDIPALVT AIVVANIYLNKKKRSAAGEY LSKQSVAAGE YPDQQDYPSS RQEYLRKQQS ADNRVKIWPI VKESLQGPALSAMLLGIALG LFTQPESVYK SFYDPLFRGL LSILMLVMGM EAWSRIGELR KVAQWYVVYSVVAPLVHGFI AFGLGMIAHY ATGFSLGGVV ILAVIAASSS DISGPPTLRA GIPSANPSAYIGASTAIGTP IAIGLAIPLF LGLAQAIGGR Cyanothece sp. MDFWSYFLMDFVKQLQSPTL GFLIGGMVIA ALGSQLVIPE YP_002485721 SEQ. ID. No. 
74 PCC7425 AICQIIVFML LTKIGLTGGM AIRNSNLTEM VLPAAFSVIS GILIVFIARYTLAKLPKVRT VDAIATGGLF GAVSGSTMAA ALTLLEEEKI PYEAWAGALY PFMDIPALVTAIVIANIYLN KKKRRAESEA LSKQEYLGKQ SIVAGDYPAQ QDYPSTRQEY LSKQQGPENNRVKIWPIVQE SLQGPALSAM LLGVALGILT KPESVYESFY DPLFRGLLSI LMLVMGMEAWSRIGELRKVA QWYVVYSVVA PFVHGLIAFG LGMFAHYTMG FSMGGVVVLA VIASSSSDISGPPTLRAGIP SANPSAYIGA STAIGTPIAI GLCIPFFIGL AQTLGGG MicrocystiMDFFSLFVMD FIQQLQSPTL AFLIGGMIIA ALGSELVIPE YP_001661223 SEQ. ID.No. 75 aeruginosa SICTIIVFML LTKIGLTGGI AIRNSNLTEM VLPMIFAVIVNIES-843 GIIVVFVARY TLANLPKVKV VDAIATGGLF GAVSGSTMAA GLTVLEEQKIPYEAWAGALY PFMDIPALVT AIVVANIYLN KKKQKEAAYD QESFSKQPVA AGNYSDQQDYPSSRQEYLSQ QQPADNRVKI WPIIEESLRG PALSAMLLGL ALGIFTQPES VYKSFYDPLFRGLLSVLMLV MGMEAWSRVG ELRKVAQWYV VYSVIAPFVH GLIAFGLGMI AHYATGFSWGGVVMLAVIAS SSSDISGPPT LRAGIPSANP SAYIGASTAI GTPVAIGLCI PFFVGLAQALSGG Anabaena MDFVSLFVKD FIAQLQSPTL AFLIGGMIIA ALGSELVIPE YP_323532SEQ. ID. No. 86 variabills SICTIIVFML LTKIGLTGGI AIRNSNLTEMVLPMIFAVIT ATCC 29413 GITIVFISRY TLAKLPKVKV VDAIATGGLF GAVSGSTMAAGLTVLEEQKM AYEAWAGALY PFMDIPALVT AIVIANIYLN KKKRKEAVYS TEQPVAAGDYPDQKDYPSSR QEYLSQQKGD EDNRVKIWPI IEESLRGPAL SAMLLGLALG LFTQPESVYKSFYDPAFRGL LSILMLVMGM EAWSRIGELR KVAQWYVVYS VVAPFVHGLI AFGLGMIAHYTMNFSMGGVV ILAVIASSSS DISGPPTLRA GIPSANPSAY IGASTAVGTP VAIGLCIPFFLGLAQAIGG Cyanothece sp. MDFLSLFVKD FIIQLQSPTL AFLIGGMVIAALGSELVIPE YP_002371470.1 SEQ. ID. No. 87 PCC 880 SICTIIVFMLLTKIGLTGGI AIRNSNLTEM VLPMICAVIV GIVVVFIARY TLAKLPKVNV VDAIATGGLFGAVSGSTMAA GLTVLEEQKI PYEAWAGALY PFMDIPALVT AIVVANIYLN KKKRKATVMQESLSKQPVAA GDYPSSRQEY VSQQQPEDNR VKIWPIIEES LRGPALSAML LGLALGILTQPESVYKGFYD PPFRGLLSIL MLVMGMEAWS RIGELRKVAQ WYVVYSVAAP FIHGLLAFGLGMIAHYTMGF SMGGVVILAV IASSSSDISG PPTLRAGIPS ANPSAYIGAS TAIGTPVAIGLCIPFFVGLA QAIGGF Arthrospia MDFLSGFLTR FLAQLQSPTL GFLIGGMVIAAVNSQLQIPD ZP.06383808.1 SEQ. ID. No. 88 platensis AIYKFVVFMLLMKVGLSGGI AIRGSNLTEM LLPAVFALVT str. 
GIVIVFIGRY TLAKLPNVKTVDAIATAGLF GAVSGSTMAA Paraca ALTLLEEQGM EYEAWAAALY PFMDIPALVSAIVLASIYVS KQKHSDMADE SLSKHESLSK QPVAAGDYPS KPEYPTTRQE YLSQQRGSANQGVEIWPIIK ESLQGSALSA LLLGLALGLL TRPESVFQSF YEPLFRGLLS ILMLVMGMEATARLGELRKV AQWYAVYAFI APLLHGLIAF GLGMIAHVVT GFSLGGVVIL AVIASSSSDISGPPTLRAGI PSANPSAYIG SSTAVGTPVA IALGIPLYIG LAQALMGG
TABLE-US-00011 TABLE D9 Exemplary chloroplast envelope localized Bicarbonate transporters

Organism | Sequence | Accession Number | SEQ. ID. NO
---|---|---|---
Chlamydomonas reinhardtii | MQTTMTRPCL AQPVLRSRVL RSPMRVVAAS APTAVTTVVT SNGNGNGHFQ AATTPVPPTP APVAVSAPVR AVSVLTPPQV YENAINVGAY KAGLTPLATF VQGIQAGAYI AFGAFLAISV GGNIPGVAAA NPGLAKLLFA LVFPVGLSMV TNCGAELFTG NTMMLTCALI EKKATWGQLL KNWSVSYFGN FVGSIAMVAA VVATGCLTTN TLPVQMATLK ANLGFTEVLS RSILCNWLVC CAVWSASAAT SLPGRILALW PCITAFVAIG LEHSVANMFV IPLGMMLGAE VTWSQFFFNN LIPVTLGNTI AGVLMMAIAY SISFGSLGKS AKPATA | BAD16681.1 | SEQ. ID. NO. 89
Volvox carteri f. nagariensis | MQTTMSVTRP CVGLRPLPVR NVRSLIRAQA APQQVSTAVS TNGNGNGVAA ASLSVPAPVA APAQAVSTPV RAVSVLTPPQ VYENAANVGA YKASLGVLAT FVQGIQAGAY IAFGAFLACS VGGNIPGITA SNPGLAKLLF ALVFPVGLSM VTNCGAELYT GNTMMLTCAI FEKKATWAQL VKNWVVSYAG NFVGSIAMVA AVVATGLMAS NQLPVNMATA KSSLGFTEVL SRSILCNWLV CCAVWSASAA TSLPGRILGL WPPITAFVAI GLEHSVANMF VIPLGMMLGA DVTWSQFFFN NLVPVTLGNT IAGVVMMAVA YSVSYGSLGK TPKPATA | XP_002951507.1 | SEQ. ID. NO. 79
TABLE-US-00012 TABLE D10 Transit Peptides

Organism | SEQ ID NO | Name
---|---|---
Arabidopsis thaliana | 8 | Rbcs-1a transit peptide
Arabidopsis thaliana | 14 | PGR5 transit peptide
Arabidopsis thaliana | 15 | psaD transit peptide
Arabidopsis thaliana | 22 | DNAJ transit peptide
Cyanophora paradoxa | 102 | psaD transit peptide
Arabidopsis thaliana | 104 | CAB transit peptide
Arabidopsis thaliana | 105 | PGR5 transit peptide
TABLE-US-00013 TABLE D11 Cyclic Electron Transfer modulator proteins

Organism | SEQ ID NO | Name | Accession No. | Function
---|---|---|---|---
Arabidopsis thaliana | 93 | Ferredoxin1 (FD1) | AEE28669.1 | cyclic electron transfer modulator protein
Arabidopsis thaliana | 95 | Ferredoxin2 (FD2) | AAG40057.1 | cyclic electron transfer modulator protein
Arabidopsis thaliana | 96 | ferredoxin-NADP(+) oxidoreductase (FNR1) | AT5G66190 partial | cyclic electron transfer modulator protein
Arabidopsis thaliana | 97 | ferredoxin-NADP(+) oxidoreductase (FNR2) | BAH19611.1 | cyclic electron transfer modulator protein
An exemplary optimized DNA sequence for the plasma membrane localized bicarbonate transporter is shown in SEQ ID NO: 91.
TABLE-US-00014 (SEQ ID NO: 91) atgctgcccg gcctgggcgt catcctgctggtgctgccca tgcagtacta cttcggctac 60 aagatcgtgc agatcaagctgcagaacgcc aagcacgtcg ccctgcgctc cgccatcatg 120 caggaggtgctgcccgccat caagctggtc aagtactacg cctgggagca gttctttgag 180aaccagatca gcaaggtccg ccgcgaggag atccgcctca acttctggaa ctgcgtgatg240 aaggtcatca acgtggcctg cgtgttctgc gtgccgccca tgaccgccttcgtcatcttc 300 accacctacg agttccagcg cgcccgcctg gtgtccagcgtcgccttcac caccctgtcg 360 ctgttcaaca ttctgcgctt ccccctggtcgtgctgccca aggccctgcg tgccgtgtcc 420 gaggccaacg cgtctctccagcgcctggag gcctacctgc tggaggaggt gccctcgggc 480 actgccgccgtcaagacccc caagaacgct ccccccggcg ccgtcatcga gaacggtgtg 540ttccaccacc cctccaaccc caactggcac ctgcacgtgc ccaagttcga ggtcaagccc600 ggccaggtcg ttgctgtggt gggccgcatc gccgccggca agtcgtccctggtgcaggcc 660 atcctcggca acatggtcaa ggagcacggc agcttcaacgtgggcggccg catctcctac 720 gtgccgcaga acccctggct gcagaacctgtccctgcgtg acaacgtgct gtttggcgag 780 cagttcgatg agaacaagtacaccgacgtc atcgagtcct gcgccctgac cctggacctg 840 cagatcctgtccaacggtga ccagtccaag gccggcatcc gcggtgtcaa cttctccggt 900ggccagcgcc agcgcgtgaa cctggcccgc tgcgcctacg ccgacgccga cctggtgctg960 ctcgacaacg ccctgtccgc cgtggaccac cacaccgccc accacatcttcgacaagtgc 1020 atcaagggcc tgttctccga caaggccgtg gtgctggtcacccaccagat cgagttcatg 1080 ccccgctgcg acaacgtggc catcatggacgagggccgct gcctgtactt cggcaagtgg 1140 aacgaggagg cccagcacctgctcggcaag ctgctgccca tcacccacct gctgcacgcc 1200 gccggctcccaggaggctcc ccccgccccc aagaagaagg ccgaggacaa ggccggcccc 1260cagaagtcgc agtcgctgca gctgaccctg gcccccacct ccatcggcaa gcccaccgag1320 aagcccaagg acgtccagaa gctgactgcc taccaggccg ccctcatctacacctggtac 1380 ggcaacctgt tcctggttgg cgtgtgcttc ttcttcttcctggcggctca gtgctctcgc 1440 cagatctccg atttctgggt gcgctggtgggtgaacgacg agtacaagaa gttccccgtg 1500 aagggcgagc aggactcggccgccaccacc ttctactgcc tcatctacct gctgctggtg 1560 ggcctgttctacatcttcat gatcttccgc ggcgccactt tcctgtggtg ggtgctcaag 1620tcctcggaga ccatccgcag gaaggccctg cacaacgtcc tcaacgcgcc catgggcttc1680 ttcctggtca cgccggtcgg 
cgacctgctg ctcaacttca ccaaggaccaggacattatg 1740 gatgagaacc tgcccgatgc cgttcacttc atgggcatctacggcctgat tctgctggcg 1800 accaccatca ccgtgtccgt caccatcaacttcttcgccg ccttcaccgg cgcgctgatc 1860 atcatgaccc tcatcatgctctccatctac ctgcccgccg ccactgccct gaagaaggcg 1920 cgcgccgtgtctggcggcat gctggtcggc ctggttgccg aggttctgga gggccttggc 1980gtggttcagg ccttcaacaa gcaggagtac ttcattgagg aggccgcccg ccgcaccaac2040 atcaccaact ccgccgtctt caacgccgag gcgctgaacc tgtggctggctttctggtgc 2100 gacttcatcg gcgcctgcct ggtgggcgtg gtgtccgccttcgccgtggg catggccaag 2160 gacctgggcg gcgcgaccgt cggcctggccttctccaaca tcattcagat gcttgtgttc 2220 tacacctggg tggtccgcttcatctccgag tccatctccc tcttcaactc cgtcgagggc 2280 atggcctacctcgccgacta cgtgccccac gatggtgtct tctatgacca gcgccagaag 2340gacggcgtcg ccaagcaaat cgtcctgccc gacggcaaca tcgtgcccgc cgcctccaag2400 gtccaggtcg tggttgacga cgccgccctc gcccgctggc ctgccaccggcaacatccgc 2460 ttcgaggacg tgtggatgca gtaccgcctg gacgctccttgggctctgaa gggcgtcacc 2520 ttcaagatca acgacggcga gaaggtcggcgccgtgggcc gcaccggctc cggcaagtcc 2580 accacgctgc tggcgctgtaccgcatgttc gagctgggca agggccgcat cctggtcgac 2640 ggcgtggacatcgccaccct gtcgctcaag cgcctgcgca ccggcctgtc catcattccc 2700caggagcccg tcatgttcac cggcaccgtg cgctccaacc tggacccctt cggcgagttc2760 aaggacgatg ccattctgtg ggaggtgctg aagaaggtcg gcctcgaggaccaggcgcag 2820 cacgccggcg gcctggacgg ccaggtcgat ggcaccggcggcaaggcctg gtctctgggc 2880 cagatgcagc tggtgtgcct ggctcgcgccgccctgcgcg ccgtgcccat cctgtgcctg 2940 gacgaggcta ccgccgccatggacccgcac actgaggcca tcgtgcagca gaccatcaag 3000 aaggtgttcgacgaccgcac caccatcacc attgcccacc gcctggacac catcatcgag 3060tccgacaaga tcatcgtgat ggagcagggc tcgctgatgg agtacgagtc gccctcgaag3120 ctgctcgcca accgcgactc catgttctcc aagctggtcg acaagaccggccccgccgcc 3180 gccgctgcgc tgcgcaagat ggccgaggac ttctggtccactcgctccgc gcagggccgc 3240 aaccagtaa
An exemplary optimized DNA sequence for the chloroplast envelope localized bicarbonate transporter is shown in SEQ ID NO: 92.
TABLE-US-00015 (SEQ ID NO: 92) atgcagacca ctatgactcg cccttgccttgcccagcccg tgctgcgatc tcgtgtgctc 60 cggtcgccta tgcgggtggttgcagcgagc gctcctaccg cggtgacgac agtcgtgacc 120 tcgaatggaaatggcaacgg tcatttccaa gctgctacta cgcccgtgcc ccctactccc 180gctcccgtcg ctgtttccgc gcctgtgcgc gctgtgtcgg tgctgactcc tcctcaagtg240 tatgagaacg ccattaatgt tggcgcctac aaggccgggc taacgcctctggcaacgttt 300 gtccagggca tccaagccgg tgcctacatt gcgttcggcgccttcctcgc catctccgtg 360 ggaggcaaca tccccggcgt cgccgccgccaaccccggcc tggccaagct gctatttgct 420 ctggtgttcc ccgtgggtctgtccatggtg accaactgcg gcgccgagct gttcacgggc 480 aacaccatgatgctcacatg cgcgctcatc gagaagaagg ccacttgggg gcagcttctg 540aagaactgga gcgtgtccta cttcggcaac ttcgtgggct ccatcgccat ggtcgccgcc600 gtggtggcca ccggctgcct gaccaccaac accctgcctg tgcagatggccaccctcaag 660 gccaacctgg gcttcaccga ggtgctgtcg cgctccatcctgtgcaactg gctggtgtgc 720 tgcgccgtgt ggtccgcctc cgccgccacctcgctgcccg gccgcatcct ggcgctgtgg 780 ccctgcatca ccgccttcgtggccatcggc ctggagcact ccgtcgccaa catgttcgtg 840 attcctctgggcatgatgct gggcgctgag gtcacgtgga gccagttctt tttcaacaac 900ctgatccccg tcaccctggg caacaccatt gctggcgttc tcatgatggc catcgcctac960 tccatctcgt tcggctccct cggcaagtcc gccaagcccg ccaccgcg 1008
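As a consistency check, the opening codons of SEQ ID NO: 91 and SEQ ID NO: 92 can be translated and compared with the first residues of the corresponding Chlamydomonas reinhardtii protein sequences in Tables D8 and D9. The sketch below is illustrative only: it assumes the standard genetic code applies to these sequences, and the names `CODON_TABLE` and `translate` are hypothetical helpers rather than anything defined in the disclosure.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering.
# Applying the standard code to these sequences is an assumption.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA to one-letter protein code, ignoring whitespace."""
    dna = "".join(dna.split()).upper()
    usable_length = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(0, usable_length, 3)
    )


# First 30 nucleotides as printed in the listings above.
seq_id_91_start = "atgctgcccg gcctgggcgt catcctgctg"
seq_id_92_start = "atgcagacca ctatgactcg cccttgcctt"

print(translate(seq_id_91_start))  # MLPGLGVILL
print(translate(seq_id_92_start))  # MQTTMTRPCL
```

Both ten-residue translations match the start of the Chlamydomonas reinhardtii entries in Tables D8 and D9 respectively. The same function could be applied to the full sequences once the position numbers embedded in the listing are stripped out.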
Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure specifically described herein. For example, a transgenic plant or alga of an embodiment disclosed herein may further comprise within its genome, and express or overexpress, a combination of heterologous nucleotide sequences that additionally encodes a Rubisco (for example, SEQ ID NO: 107). Further still, a transit peptide amino acid sequence at the amino-terminal portion of a protein sequence identified herein may be cleaved, leaving the protein sequence alone. The percent homology also applies to the protein sequence without the transit peptide sequence. Such equivalents are intended to be encompassed within the scope of the following claims.
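The effect of excluding the transit peptide from a percent comparison can be sketched as follows. This is a hedged illustration only: it treats percent homology as simple percent identity over pre-aligned, equal-length sequences, which is an assumption, and the two toy sequences, the transit peptide length of 5, and the helper names are hypothetical and do not correspond to any SEQ ID NO herein.

```python
def strip_transit_peptide(protein, transit_length):
    """Return the protein without its amino-terminal transit peptide."""
    return protein[transit_length:]


def percent_identity(seq_a, seq_b):
    """Percent of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical toy sequences: a 5-residue transit peptide followed by a
# 10-residue mature region differing at one position.
reference = "MAAAS" + "KNVNEGKGLF"
variant = "MSSTT" + "KNVNEGKALF"
transit_length = 5  # hypothetical value

with_peptide = percent_identity(reference, variant)
without_peptide = percent_identity(
    strip_transit_peptide(reference, transit_length),
    strip_transit_peptide(variant, transit_length),
)
print(f"with transit peptide: {with_peptide:.1f}%")
print(f"without transit peptide: {without_peptide:.1f}%")
```

A real comparison would first align the sequences with an alignment tool; the sketch omits that step to keep the focus on the exclusion of the transit peptide.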
SEQUENCE LISTINGS
1
1081132PRTArabidopsis thalianaMISC_FEATURE(1)..(44)PGR5 Amino acidsequence with chloroplast transit peptide 1Met Ala Ala Ala Ser IleSer Ala Ile Gly Cys Asn Gln Thr Leu Ile1 5 10 15Gly Thr Ser Phe TyrGly Gly Trp Gly Ser Ser Ile Ser Gly Glu Asp 20 25 30Tyr Gln Thr MetLeu Ser Lys Thr Val Ala Pro Pro Gln Gln Ala Arg 35 40 45Val Ser ArgLys Ala Ile Arg Ala Val Pro Met Met Lys Asn Val Asn 50 55 60Glu GlyLys Gly Leu Phe Ala Pro Leu Val Val Val Thr Arg Asn Leu65 70 7580Val Gly Lys Lys Arg Phe Asn Gln Leu Arg Gly Lys Ala Ile Ala Leu85 90 95His Ser Gln Val Ile Thr Glu Phe Cys Lys Ser Ile Gly Ala AspAla 100 105 110Lys Gln Arg Gln Gly Leu Ile Arg Leu Ala Lys Lys AsnGly Glu Arg 115 120 125Leu Gly Phe Leu 1302402DNAArabidopsisthaliana 2atggctgctg cttcgatttc tgcaatagga tgtaatcaaa ctttgataggaacttccttc 60tatggaggat ggggaagttc catctccgga gaagattacc aaaccatgctctccaagaca 120gttgcgccac cgcaacaagc cagagtctca aggaaagcaatcagagcagt tccaatgatg 180aagaatgtca atgaaggcaa aggcttatttgcacctctag ttgttgtcac acgcaaccta 240gtaggcaaga agaggtttaatcagctcaga ggaaaagcca ttgccttaca ctctcaggtg 300atcactgagttttgcaaatc gattggagca gatgcaaaac agagacaagg gcttatcagg360cttgctaaga agaatggaga gaggcttggt ttccttgctt ag4023324PRTArabidopsis thaliana 3Met Gly Ser Lys Met Leu Phe Ser LeuThr Ser Pro Arg Leu Phe Ser1 5 10 15Ala Val Ser Arg Lys Pro Ser SerSer Phe Ser Pro Ser Pro Pro Ser 20 25 30Pro Ser Ser Arg Thr Gln TrpThr Gln Leu Ser Pro Gly Lys Ser Ile 35 40 45Ser Leu Arg Arg Arg ValPhe Leu Leu Pro Ala Lys Ala Thr Thr Glu 50 55 60Gln Ser Gly Pro ValGly Gly Asp Asn Val Asp Ser Asn Val Leu Pro65 70 75 80Tyr Cys SerIle Asn Lys Ala Glu Lys Lys Thr Ile Gly Glu Met Glu 85 90 95Gln GluPhe Leu Gln Ala Leu Gln Ser Phe Tyr Tyr Asp Gly Lys Ala 100 105110Ile Met Ser Asn Glu Glu Phe Asp Asn Leu Lys Glu Glu Leu Met Trp115 120 125Glu Gly Ser Ser Val Val Met Leu Ser Ser Asp Glu Gln ArgPhe Leu 130 135 140Glu Ala Ser Met Ala Tyr Val Ser Gly Asn Pro IleLeu Asn Asp Glu145 150 155 160Glu Tyr Asp Lys Leu Lys Leu Lys LeuLys Ile Asp Gly Ser Asp 
Ile 165 170 175Val Ser Glu Gly Pro Arg CysSer Leu Arg Ser Lys Lys Val Tyr Ser 180 185 190Asp Leu Ala Val AspTyr Phe Lys Met Leu Leu Leu Asn Val Pro Ala 195 200 205Thr Val ValAla Leu Gly Leu Phe Phe Phe Leu Asp Asp Ile Thr Gly 210 215 220PheGlu Ile Thr Tyr Ile Met Glu Leu Pro Glu Pro Tyr Ser Phe Ile225 230235 240Phe Thr Trp Phe Ala Ala Val Pro Val Ile Val Tyr Leu Ala LeuSer 245 250 255Ile Thr Lys Leu Ile Ile Lys Asp Phe Leu Ile Leu LysGly Pro Cys 260 265 270Pro Asn Cys Gly Thr Glu Asn Thr Ser Phe PheGly Thr Ile Leu Ser 275 280 285Ile Ser Ser Gly Gly Lys Thr Asn ThrVal Lys Cys Thr Asn Cys Gly 290 295 300Thr Ala Met Val Tyr Asp SerGly Ser Arg Leu Ile Thr Leu Pro Glu305 310 315 320Gly Ser GlnAla4975DNANeisseria gonorrhoeaemisc_featurecodon optimized forArabidopsis thaliana 4atgggtagca agatgttgtt tagtttgaca agtcctcgacttttctccgc cgtttctcgc 60aaaccttcct cttctttctc tccttctcct ccgtcgccgtcttcgaggac tcaatggact 120cagctcagcc ctggaaaatc gatttctttgagaagaagag tcttcttgtt gcctgctaaa 180gccacaacag agcaatcaggtccagtagga ggagacaacg tcgatagcaa tgttttgccc 240tattgtagcatcaacaaggc tgagaagaaa acaattggtg aaatggaaca agagtttctc300caagcgttgc aatctttcta ttatgatggc aaagcgatca tgtctaatgaagagtttgat 360aaccttaaag aagagttaat gtgggaagga agcagtgttgtgatgctaag ttccgatgaa 420caaagattct tggaagcttc catggcttatgtttctggaa atccaatctt gaatgatgaa 480gaatatgata agctcaaactcaaactaaag attgatggta gcgacattgt gagcgagggt 540ccaagatgcagtctccgtag taaaaaggtg tatagtgatc tcgctgtaga ttatttcaaa600atgttattgt tgaatgttcc agcaaccgtt gttgctctcg gactctttttcttcctggac 660gacattacag gttttgagat cacatacatc atggagcttccagaaccata cagtttcata 720ttcacttggt tcgctgctgt gcctgtgattgtatatctgg ctttatcaat caccaaattg 780atcatcaagg acttcttgatcttgaagggt ccttgtccga attgtggaac ggaaaacacc 840tccttctttggaacaattct gtcaatctcc agcggcggca aaaccaacac tgtcaaatgc900accaactgcg gaaccgcgat ggtgtatgac tcgggttcta ggttgatcacattgccagaa 960ggaagccaag cttaa 9755280PRTNeisseriagonorrhoeaeMISC_FEATURE(1)..(54)Bacterial carbonic anhydrase (BCA)amino acid sequence with rbcs-1a 
transit peptide 5Met Ala Ser SerMet Leu Ser Ser Ala Thr Met Val Ala Ser Pro Ala1 5 10 15Gln Ala ThrMet Val Ala Pro Phe Asn Gly Leu Lys Ser Ser Ala Ala 20 25 30Phe ProAla Thr Arg Lys Ala Asn Asn Asp Ile Thr Ser Ile Thr Ser 35 40 45AsnGly Gly Arg Val Asn His Gly Asn His Thr His Trp Gly Tyr Thr 50 5560Gly His Asp Ser Pro Glu Ser Trp Gly Asn Leu Ser Glu Glu Phe Arg6570 75 80Leu Cys Ser Thr Gly Lys Asn Gln Ser Pro Val Asn Ile Thr GluThr 85 90 95Val Ser Gly Lys Leu Pro Ala Ile Lys Val Asn Tyr Lys ProSer Met 100 105 110Val Asp Val Glu Asn Asn Gly His Thr Ile Gln ValAsn Tyr Pro Glu 115 120 125Gly Gly Asn Thr Leu Thr Val Asn Gly ArgThr Tyr Thr Leu Lys Gln 130 135 140Phe His Phe His Val Pro Ser GluAsn Gln Ile Lys Gly Arg Thr Phe145 150 155 160Pro Met Glu Ala HisPhe Val His Leu Asp Glu Asn Lys Gln Pro Leu 165 170 175Val Leu AlaVal Leu Tyr Glu Ala Gly Lys Thr Asn Gly Arg Leu Ser 180 185 190SerIle Trp Asn Val Met Pro Met Thr Ala Gly Lys Val Lys Leu Asn 195 200205Gln Pro Phe Asp Ala Ser Thr Leu Leu Pro Lys Arg Leu Lys Tyr Tyr210 215 220Arg Phe Ala Gly Ser Leu Thr Thr Pro Pro Cys Thr Glu GlyVal Ser225 230 235 240Trp Leu Val Leu Lys Thr Tyr Asp His Ile AspGln Ala Gln Ala Glu 245 250 255Lys Phe Thr Arg Ala Val Gly Ser GluAsn Asn Arg Pro Val Gln Pro 260 265 270Leu Asn Ala Arg Val Val IleGlu 275 2806684DNANeisseria gonorrhoeae 6aaccacggca atcacacccattggggctat accggacacg actctcccga aagctggggc 60aatctgtcag aagaattccgtttgtgctcc accggcaaaa accaatctcc ggtaaacatt 120accgaaaccgtttccggcaa actgcccgcc atcaaagtca attacaaacc gagtatggtt180gacgtggaaa acaacggcca caccattcag gtcaattatc ccgaaggcggcaataccctg 240accgtgaacg gcagaaccta taccctgaaa cagttccacttccacgtgcc gagcgaaaac 300caaatcaaag gcagaacttt cccgatggaagctcacttcg tccacttaga cgaaaacaaa 360cagcctttag tattagccgtgctgtatgaa gccggcaaaa ccaacgggag actgtcttcc 420atctggaacgtcatgccgat gaccgcagga aaagtgaaac tcaaccaacc gttcgacgca480tccaccctac tgccgaaaag attgaaatac tacagatttg ccggttcgctgaccacgccg 540ccgtgcacag agggcgtatc atggttggtg ttgaaaacttatgaccacat 
cgaccaagcg 600caagcggaaa aattcaccag agccgtcggttcggaaaaca acagacccgt acagcctctg 660aatgcacgtg tagttattga ataa6847356DNAArabidopsis thaliana 7ggtttacatt gatgctctca ggatttcataaggatagaga gatctattcg tatacgtgtc 60acgtcatgag tgggtgtttc gccaatccatgaaacgcacc tagatatcta aaacacatat 120caattgcgaa tctgcgaagtgcgagccatt aaccacgtaa gcaaacaaac aatctaaacc 180ccaaaaaaaatctatgacta gccaatagca acctcagaga ttgatatttc aagataagac240agtatttaga tttctgtatt atatatagcg aaaatcgcat caataccaaaccacccattt 300cttggcttac aacaacaaat cttaaacgtt ttactttgtgctgcactact caacct 3568162DNAArabidopsis thalianamisc_featureRbcs-1atransit peptide 8atggcttcct ctatgctctc ttccgctact atggttgcctctccggctca ggccactatg 60gtcgctcctt tcaacggact taagtcctcc gctgccttcccagccaccag aaaggctaac 120aacgacatta cttccatcac aagcaacggcggaagagtta ac 1629207DNAArtificial SequenceNOS terminator fromcloning vector.misc_featureNos terminator 9atcgttcaaa catttggcaataaagtttct taagattgaa tcctgttgcc ggtcttgcga 60tgattatcat ataatttctgttgaattacg ttaagcatgt aataattaac atgtaatgca 120tgacgttatttatgagatgg gtttttatga ttagagtccc gcaattatac atttaatacg180cgatagaaaa caaaatatag cgcgcaa 20710251DNANicotiana sylvestris10tgtggtcaca cctcaaacta aatcaaccag tttgcatttt tttccttctc aatgttaatt60tgctgacttg gctagggtgc gaatcaaatc acacgttcta attgggcaaa atccgtatat120caccttatcc tatatccttt ttctccacca cccatcatct cttctatgcaacaaaaatag 180cttcttcctt ttcatttttc acttctctca atccaacttttctatggcca tggcatccca 240agcttccctt t 25111255DNAArabidopsisthaliana 11tatacaaagc aaccgatcaa gtggagacta gtaaaccata cacaatcactcatttcctca 60caaaagaaag ataagataag ggtgtcaaca cctttcctta atcatgtggtagtgaacgag 120ttatcatgaa tcccggaccc tttgatcatt agggctttttgcctcttacg gttctcacta 180tataaagatg acaaaaccaa tagaaaaacaattaagcaaa agaagaagaa gaagaagtaa 240tggcttcctc tatgc255123249DNAChlamydomonas reinhardtii 12atgcttcctg gtcttggtgtcatccttctt gtgcttccta tgcagtacta cttcggttac 60aagatcgtgc agatcaagcttcagaacgct aagcacgtcg ctcttcgttc tgctatcatg 120caggaggtgcttcctgctat caagcttgtc aagtactacg cttgggagca gttctttgag180aaccagatct 
ctaaggtccg tcgtgaggag atccgtctca acttctggaactgcgtgatg 240aaggtcatca acgtggcttg cgtgttctgc gtgccgcctatgaccgcttt cgtcatcttc 300accacctacg agttccagcg tgctcgtcttgtgtcttctg tcgctttcac caccctttct 360cttttcaaca ttcttcgtttccctcttgtc gtgcttccta aggctcttcg tgctgtgtct 420gaggctaacgcttctctcca gcgtcttgag gcttaccttc ttgaggaggt gccttctggt480actgctgctg tcaagacccc taagaacgct cctcctggtg ctgtcatcgagaacggtgtg 540ttccaccacc cttctaaccc taactggcac cttcacgtgcctaagttcga ggtcaagcct 600ggtcaggtcg ttgctgtggt gggtcgtatcgctgctggta agtcttctct tgtgcaggct 660atcctcggta acatggtcaaggagcacggt tctttcaacg tgggtggtcg tatctcttac 720gtgccgcagaacccttggct tcagaacctt tctcttcgtg acaacgtgct ttttggtgag780cagttcgatg agaacaagta caccgacgtc atcgagtctt gcgctcttacccttgacctt 840cagatccttt ctaacggtga ccagtctaag gctggtatccgtggtgtcaa cttctctggt 900ggtcagcgtc agcgtgtgaa ccttgctcgttgcgcttacg ctgacgctga ccttgtgctt 960ctcgacaacg ctctttctgctgtggaccac cacaccgctc accacatctt cgacaagtgc 1020atcaagggtcttttctctga caaggctgtg gtgcttgtca cccaccagat cgagttcatg1080cctcgttgcg acaacgtggc tatcatggac gagggtcgtt gcctttacttcggtaagtgg 1140aacgaggagg ctcagcacct tctcggtaag cttcttcctatcacccacct tcttcacgct 1200gctggttctc aggaggctcc tcctgctcctaagaagaagg ctgaggacaa ggctggtcct 1260cagaagtctc agtctcttcagcttaccctt gctcctacct ctatcggtaa gcctaccgag 1320aagcctaaggacgtccagaa gcttactgct taccaggctg ctctcatcta cacctggtac1380ggtaaccttt tccttgttgg tgtgtgcttc ttcttcttcc ttgctgctcagtgctctcgt 1440cagatctctg atttctgggt gcgttggtgg gtgaacgacgagtacaagaa gttccctgtg 1500aagggtgagc aggactctgc tgctaccaccttctactgcc tcatctacct tcttcttgtg 1560ggtcttttct acatcttcatgatcttccgt ggtgctactt tcctttggtg ggtgctcaag 1620tcttctgagaccatccgtag gaaggctctt cacaacgtcc tcaacgctcc tatgggtttc1680ttccttgtca cgccggtcgg tgaccttctt ctcaacttca ccaaggaccaggacattatg 1740gatgagaacc ttcctgatgc tgttcacttc atgggtatctacggtcttat tcttcttgct 1800accaccatca ccgtgtctgt caccatcaacttcttcgctg ctttcaccgg tgctcttatc 1860atcatgaccc tcatcatgctctctatctac cttcctgctg ctactgctct taagaaggct 1920cgtgctgtgtctggtggtat 
gcttgtcggt cttgttgctg aggttcttga gggtcttggt1980gtggttcagg ctttcaacaa gcaggagtac ttcattgagg aggctgctcgtcgtaccaac 2040atcaccaact ctgctgtctt caacgctgag gctcttaacctttggcttgc tttctggtgc 2100gacttcatcg gtgcttgcct tgtgggtgtggtgtctgctt tcgctgtggg tatggctaag 2160gaccttggtg gtgctaccgtcggtcttgct ttctctaaca tcattcagat gcttgtgttc 2220tacacctgggtggtccgttt catctctgag tctatctctc tcttcaactc tgtcgagggt2280atggcttacc tcgctgacta cgtgcctcac gatggtgtct tctatgaccagcgtcagaag 2340gacggtgtcg ctaagcaaat cgtccttcct gacggtaacatcgtgcctgc tgcttctaag 2400gtccaggtcg tggttgacga cgctgctctcgctcgttggc ctgctaccgg taacatccgt 2460ttcgaggacg tgtggatgcagtaccgtctt gacgctcctt gggctcttaa gggtgtcacc 2520ttcaagatcaacgacggtga gaaggtcggt gctgtgggtc gtaccggttc tggtaagtct2580accacgcttc ttgctcttta ccgtatgttc gagcttggta agggtcgtatccttgtcgac 2640ggtgtggaca tcgctaccct ttctctcaag cgtcttcgtaccggtctttc tatcattcct 2700caggagcctg tcatgttcac cggtaccgtgcgttctaacc ttgacccttt cggtgagttc 2760aaggacgatg ctattctttgggaggtgctt aagaaggtcg gtctcgagga ccaggctcag 2820cacgctggtggtcttgacgg tcaggtcgat ggtaccggtg gtaaggcttg gtctcttggt2880cagatgcagc ttgtgtgcct tgctcgtgct gctcttcgtg ctgtgcctatcctttgcctt 2940gacgaggcta ccgctgctat ggacccgcac actgaggctatcgtgcagca gaccatcaag 3000aaggtgttcg acgaccgtac caccatcaccattgctcacc gtcttgacac catcatcgag 3060tctgacaaga tcatcgtgatggagcagggt tctcttatgg agtacgagtc tccttctaag 3120cttctcgctaaccgtgactc tatgttctct aagcttgtcg acaagaccgg tcctgctgct3180gctgctgctc ttcgtaagat ggctgaggac ttctggtcta ctcgttctgctcagggtcgt 3240aaccagtaa 32491335PRTArabidopsis thaliana 13Met AlaAla Ser Thr Met Ala Leu Ser Ser Pro Ala Phe Ala Gly Lys1 5 10 15AlaVal Asn Leu Ser Pro Ala Ala Ser Glu Val Leu Gly Ser Gly Arg 20 2530Val Thr Met 351444PRTArabidopsis thalianaMISC_FEATUREPGR5 transitpeptide 14Met Ala Ala Ala Ser Ile Ser Ala Ile Gly Cys Asn Gln ThrLeu Ile1 5 10 15Gly Thr Ser Phe Tyr Gly Gly Trp Gly Ser Ser Ile SerGly Glu Asp 20 25 30Tyr Gln Thr Met Leu Ser Lys Thr Val Ala Pro Pro35 401545PRTArabidopsis thalianaMISC_FEATUREpsaD transit 
peptide 15Met Ala Thr Gln Ala Ala Gly Ile Phe Asn Ser Ala Ile Thr Thr Ala15 10 15Ala Thr Ser Gly Val Lys Lys Leu His Phe Phe Ser Thr Thr HisArg 20 25 30Pro Lys Ser Leu Ser Phe Thr Lys Thr Ala Ile Arg Ala 3540 45161011DNAChlamydomonas reinhardtii 16atgcagacca ctatgactcgcccttgcctt gcccagcccg tgctgcgatc tcgtgtgctc 60cggtcgccta tgcgggtggttgcagcgagc gctcctaccg cggtgacgac agtcgtgacc 120tcgaatggaaatggcaacgg tcatttccaa gctgctacta cgcccgtgcc ccctactccc180gctcccgtcg ctgtttccgc gcctgtgcgc gctgtgtcgg tgctgactcctcctcaagtg 240tatgagaacg ccattaatgt tggcgcctac aaggccgggctaacgcctct ggcaacgttt 300gtccagggca tccaagccgg tgcctacattgcgttcggcg ccttcctcgc catctccgtg 360ggaggcaaca tccccggcgtcgccgccgcc aaccccggcc tggccaagct gctatttgct 420ctggtgttccccgtgggtct gtccatggtg accaactgcg gcgccgagct gttcacgggc480aacaccatga tgctcacatg cgcgctcatc gagaagaagg ccacttgggggcagcttctg 540aagaactgga gcgtgtccta cttcggcaac ttcgtgggctccatcgccat ggtcgccgcc 600gtggtggcca ccggctgcct gaccaccaacaccctgcctg tgcagatggc caccctcaag 660gccaacctgg gcttcaccgaggtgctgtcg cgctccatcc tgtgcaactg gctggtgtgc 720tgcgccgtgtggtccgcctc cgccgccacc tcgctgcccg gccgcatcct ggcgctgtgg780ccctgcatca ccgccttcgt ggccatcggc ctggagcact ccgtcgccaacatgttcgtg 840attcctctgg gcatgatgct gggcgctgag gtcacgtggagccagttctt tttcaacaac 900ctgatccccg tcaccctggg caacaccattgctggcgttc tcatgatggc catcgcctac 960tccatctcgt tcggctccctcggcaagtcc gccaagcccg ccaccgcgta a 101117892DNAHomo sapiens17atgatatcct cttcagctgt gactacagtc agccgtgctt ctacggtgca atcggccgcg60gtggctccat tcggcggcct caaatccatg actggattcc cagttaagaa ggtcaacact120gacattactt ccattacaag caatggtgga agagtaaagt gcatgcaggtggagctctct 180catcattggg gttatggtaa acacaatggt cctgaacactggcataaaga ctttccaatt 240gcaaaaggtg aacgtcaatc acctgttgatattgacactc atacagctaa atatgaccct 300tctttaaaac cattatctgtttcatatgat caagcaactt ctttacgtat tttaaacaat 360ggtcatgcttttaatgtaga atttgatgac tctcaagata aagcagtatt aaaaggtggt420ccattagatg gtacttaccg tttaattcaa tttcactttc actggggttcattagatggt 480caaggttcag aacatactgt agataaaaaa 
aaatatgctgcagaattaca cttagttcac 540tggaacacaa aatatggtga ttttggtaaagctgtacaac aacctgatgg tttagctgtt 600ttaggtattt ttttaaaagttggtagtgct aaaccaggtc ttcaaaaagt tgttgatgta 660ttagattcaattaaaacaaa aggtaaaagt gctgacttta ctaatttcga tcctcgtggt720ttacttcctg aatctttaga ttactggaca tatccaggtt cattaacaacacctcctctt 780ttagaatgtg taacatggat tgtattaaaa gaaccaattagtgtaagtag tgaacaagta 840ttaaaattcc gtaaacttaa tttcaatggtgaaggtgaac cagaagaatt aa 89218336PRTChlamydomonas reinhardtii 18MetGln Thr Thr Met Thr Arg Pro Cys Leu Ala Gln Pro Val Leu Arg1 5 1015Ser Arg Val Leu Arg Ser Pro Met Arg Val Val Ala Ala Ser Ala Pro20 25
30Thr Ala Val Thr Thr Val Val Thr Ser Asn Gly Asn Gly Asn Gly His35 40 45Phe Gln Ala Ala Thr Thr Pro Val Pro Pro Thr Pro Ala Pro ValAla 50 55 60Val Ser Ala Pro Val Arg Ala Val Ser Val Leu Thr Pro ProGln Val65 70 75 80Tyr Glu Asn Ala Ile Asn Val Gly Ala Tyr Lys AlaGly Leu Thr Pro 85 90 95Leu Ala Thr Phe Val Gln Gly Ile Gln Ala GlyAla Tyr Ile Ala Phe 100 105 110Gly Ala Phe Leu Ala Ile Ser Val GlyGly Asn Ile Pro Gly Val Ala 115 120 125Ala Ala Asn Pro Gly Leu AlaLys Leu Leu Phe Ala Leu Val Phe Pro 130 135 140Val Gly Leu Ser MetVal Thr Asn Cys Gly Ala Glu Leu Phe Thr Gly145 150 155 160Asn ThrMet Met Leu Thr Cys Ala Leu Ile Glu Lys Lys Ala Thr Trp 165 170175Gly Gln Leu Leu Lys Asn Trp Ser Val Ser Tyr Phe Gly Asn Phe Val180 185 190Gly Ser Ile Ala Met Val Ala Ala Val Val Ala Thr Gly CysLeu Thr 195 200 205Thr Asn Thr Leu Pro Val Gln Met Ala Thr Leu LysAla Asn Leu Gly 210 215 220Phe Thr Glu Val Leu Ser Arg Ser Ile LeuCys Asn Trp Leu Val Cys225 230 235 240Cys Ala Val Trp Ser Ala SerAla Ala Thr Ser Leu Pro Gly Arg Ile 245 250 255Leu Ala Leu Trp ProCys Ile Thr Ala Phe Val Ala Ile Gly Leu Glu 260 265 270His Ser ValAla Asn Met Phe Val Ile Pro Leu Gly Met Met Leu Gly 275 280 285AlaGlu Val Thr Trp Ser Gln Phe Phe Phe Asn Asn Leu Ile Pro Val 290 295300Thr Leu Gly Asn Thr Ile Ala Gly Val Leu Met Met Ala Ile AlaTyr305 310 315 320Ser Ile Ser Phe Gly Ser Leu Gly Lys Ser Ala LysPro Ala Thr Ala 325 330 33519260PRTHomo sapiens 19Met Ser His HisTrp Gly Tyr Gly Lys His Asn Gly Pro Glu His Trp1 5 10 15His Lys AspPhe Pro Ile Ala Lys Gly Glu Arg Gln Ser Pro Val Asp 20 25 30Ile AspThr His Thr Ala Lys Tyr Asp Pro Ser Leu Lys Pro Leu Ser 35 40 45ValSer Tyr Asp Gln Ala Thr Ser Leu Arg Ile Leu Asn Asn Gly His 50 5560Ala Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Lys Ala Val Leu Lys6570 75 80Gly Gly Pro Leu Asp Gly Thr Tyr Arg Leu Ile Gln Phe His PheHis 85 90 95Trp Gly Ser Leu Asp Gly Gln Gly Ser Glu His Thr Val AspLys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp AsnThr Lys Tyr Gly 115 120 
125Asp Phe Gly Lys Ala Val Gln Gln Pro AspGly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu Lys Val Gly Ser AlaLys Pro Gly Leu Gln Lys Val Val145 150 155 160Asp Val Leu Asp SerIle Lys Thr Lys Gly Lys Ser Ala Asp Phe Thr 165 170 175Asn Phe AspPro Arg Gly Leu Leu Pro Glu Ser Leu Asp Tyr Trp Thr 180 185 190TyrPro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys Val Thr Trp 195 200205Ile Val Leu Lys Glu Pro Ile Ser Val Ser Ser Glu Gln Val Leu Lys210 215 220Phe Arg Lys Leu Asn Phe Asn Gly Glu Gly Glu Pro Glu GluLeu Met225 230 235 240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu LysAsn Arg Gln Ile Lys 245 250 255Ala Ser Phe Lys 26020159PRTHomosapiens 20Met Val Met Leu Ser Thr Trp Ser Leu Met Thr Leu Arg ThrLys Gln1 5 10 15Leu His Leu Val His Trp Asn Thr Lys Tyr Gly Asp PheGly Lys Ala 20 25 30Val Gln Gln Pro Asp Gly Leu Ala Val Leu Gly IlePhe Leu Lys Val 35 40 45Gly Ser Ala Lys Pro Gly Leu Gln Lys Val ValAsp Val Leu Asp Ser 50 55 60Ile Lys Thr Lys Gly Lys Ser Ala Asp PheThr Asn Phe Asp Pro Arg65 70 75 80Gly Leu Leu Pro Glu Ser Leu AspTyr Trp Thr Tyr Pro Gly Ser Leu 85 90 95Thr Thr Pro Pro Leu Leu GluCys Val Thr Trp Ile Val Leu Lys Glu 100 105 110Pro Ile Ser Val SerSer Glu Gln Val Leu Lys Phe Arg Lys Leu Asn 115 120 125Phe Asn GlyGlu Gly Glu Pro Glu Glu Leu Met Val Asp Asn Trp Arg 130 135 140ProAla Gln Pro Leu Lys Asn Arg Gln Ile Lys Ala Ser Phe Lys145 15015521252PRTNeisseria gonorrhoeae 21Met Pro Arg Phe Pro Arg Thr LeuPro Arg Leu Thr Ala Val Leu Leu1 5 10 15Leu Ala Cys Thr Ala Phe SerAla Ala Ala His Gly Asn His Thr His 20 25 30Trp Gly Tyr Thr Gly HisAsp Ser Pro Glu Ser Trp Gly Asn Leu Ser 35 40 45Glu Glu Phe Arg LeuCys Ser Thr Gly Lys Asn Gln Ser Pro Val Asn 50 55 60Ile Thr Glu ThrVal Ser Gly Lys Leu Pro Ala Ile Lys Val Asn Tyr65 70 75 80Lys ProSer Met Val Asp Val Glu Asn Asn Gly His Thr Ile Gln Val 85 90 95AsnTyr Pro Glu Gly Gly Asn Thr Leu Thr Val Asn Gly Arg Thr Tyr 100 105110Thr Leu Lys Gln Phe His Phe His Val Pro Ser Glu Asn Gln Ile Lys115 120 125Gly Arg Thr Phe Pro Met Glu Ala 
His Phe Val His Leu AspGlu Asn 130 135 140Lys Gln Pro Leu Val Leu Ala Val Leu Tyr Glu AlaGly Lys Thr Asn145 150 155 160Gly Arg Leu Ser Ser Ile Trp Asn ValMet Pro Met Thr Ala Gly Lys 165 170 175Val Lys Leu Asn Gln Pro PheAsp Ala Ser Thr Leu Leu Pro Lys Arg 180 185 190Leu Lys Tyr Tyr ArgPhe Ala Gly Ser Leu Thr Thr Pro Pro Cys Thr 195 200 205Glu Gly ValSer Trp Leu Val Leu Lys Thr Tyr Asp His Ile Asp Gln 210 215 220AlaGln Ala Glu Lys Phe Thr Arg Ala Val Gly Ser Glu Asn Asn Arg225 230235 240Pro Val Gln Pro Leu Asn Ala Arg Val Val Ile Glu 24525022217PRTArabidopsis thalianaMISC_FEATUREDNAJ transit peptide22Met Ala Ser Leu Ser Thr Ile Thr Gln Pro Ser Leu Val His Ile Pro15 10 15Gly Glu Ser Val Leu His His Val Pro Ser Thr Cys Ser Phe ProTrp 20 25 30Lys Pro Thr Ile Asn Thr Lys Arg Ile Ile Cys Ser Pro AlaArg Asn 35 40 45Ser Ser Glu Val Ser Ala Glu Ala Glu Thr Glu Gly GlySer Ser Thr 50 55 60Ala Val Asp Glu Ala Pro Lys Glu Ser Pro Ser LeuIle Ser Ala Leu65 70 75 80Asn Val Glu Arg Ala Leu Arg Gly Leu ProIle Thr Asp Val Asp His 85 90 95Tyr Gly Arg Leu Gly Ile Phe Arg AsnCys Ser Tyr Asp Gln Val Thr 100 105 110Ile Gly Tyr Lys Glu Arg ValLys Glu Leu Lys Glu Gln Gly Leu Asp 115 120 125Glu Glu Gln Leu LysThr Lys Met Asp Leu Ile Lys Ser Tyr Thr Ile 130 135 140Leu Ser ThrVal Glu Glu Arg Arg Met Tyr Asp Trp Ser Leu Ala Arg145 150 155160Ser Glu Lys Ala Glu Arg Tyr Val Trp Pro Phe Glu Val Asp Ile Met165 170 175Glu Pro Ser Arg Glu Glu Pro Pro Pro Gln Glu Pro Glu AspVal Gly 180 185 190Pro Thr Arg Ile Leu Gly Tyr Phe Ile Gly Ala TrpLeu Val Leu Gly 195 200 205Val Ala Leu Ser Val Ala Phe Asn Arg 2102152367PRTArabidopsis thaliana 23Met Asp Lys Ala Leu Thr Gly IleSer Ala Ala Ala Leu Thr Ala Ser1 5 10 15Met Val Ile Pro Glu Ile AlaGlu Ala Ala Gly Ser Gly Ile Ser Pro 20 25 30Ser Leu Lys Asn Phe LeuLeu Ser Ile Ala Ser Gly Gly Leu Val Leu 35 40 45Thr Val Ile Ile GlyVal Val Val Gly Val Ser Asn Phe Asp Pro Val 50 55 60Lys ArgThr6524260PRTMacaca fascicularis 24Met Ser His His Trp Gly Tyr GlyLys His Asn Gly 
Pro Glu His Trp1 5 10 15His Lys Asp Phe Pro Ile AlaLys Gly Gln Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr His Thr AlaLys Tyr Asp Pro Ser Leu Lys Pro Leu Ser 35 40 45Val Ser Tyr Asp GlnAla Thr Ser Leu Arg Ile Leu Asn Asn Gly His 50 55 60Ser Phe Asn ValGlu Phe Asp Asp Ser Gln Asp Lys Ala Val Ile Lys65 70 75 80Gly GlyPro Leu Asp Gly Thr Tyr Arg Leu Ile Gln Phe His Phe His 85 90 95TrpGly Ser Leu Asp Gly Gln Gly Ser Glu His Thr Val Asp Lys Lys 100 105110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp Asn Thr Lys Tyr Gly115 120 125Asp Phe Gly Lys Ala Val Gln Gln Pro Asp Gly Leu Ala ValLeu Gly 130 135 140Ile Phe Leu Lys Val Gly Ser Ala Lys Pro Gly LeuGln Lys Val Val145 150 155 160Asp Val Leu Asp Ser Ile Lys Thr LysGly Lys Ser Ala Asp Phe Thr 165 170 175Asn Phe Asp Pro Arg Gly LeuLeu Pro Glu Ser Leu Asp Tyr Trp Thr 180 185 190Tyr Pro Gly Ser LeuThr Thr Pro Pro Leu Leu Glu Cys Val Thr Trp 195 200 205Ile Val LeuLys Glu Pro Ile Ser Val Ser Ser Glu Gln Met Ser Lys 210 215 220PheArg Lys Leu Asn Phe Asn Gly Glu Gly Glu Pro Glu Glu Leu Met225 230235 240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu Lys Asn Arg Gln IleLys 245 250 255Ala Ser Phe Lys 26025260PRTPan troglodytes 25Met SerHis His Trp Gly Tyr Gly Lys His Asn Gly Pro Glu His Trp1 5 10 15HisLys Asp Phe Pro Ile Ala Lys Gly Glu Arg Gln Ser Pro Val Asp 20 2530Ile Asp Thr His Thr Ala Lys Tyr Asp Pro Ser Leu Lys Pro Leu Ser35 40 45Val Ser Tyr Gly Gln Ala Thr Ser Leu Arg Ile Leu Asn Asn GlyHis 50 55 60Ala Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Lys Ala ValLeu Lys65 70 75 80Gly Gly Pro Leu Asp Gly Thr Tyr Arg Leu Ile GlnPhe His Phe His 85 90 95Trp Gly Ser Leu Asp Gly Gln Gly Ser Glu HisThr Val Asp Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu ValHis Trp Asn Thr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala Val GlnGln Pro Asp Gly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu Lys ValGly Ser Ala Lys Pro Gly Leu Gln Lys Val Val145 150 155 160Asp ValLeu Asp Ser Ile Lys Thr Lys Gly Lys Ser Ala Asp Phe Thr 165 170175Asn Phe Asp Pro His Gly Leu 
Leu Pro Glu Ser Leu Asp Tyr Trp Thr180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys ValThr Trp 195 200 205Ile Val Leu Lys Glu Pro Ile Ser Val Ser Ser GluGln Met Leu Lys 210 215 220Phe Arg Lys Leu Asn Phe Asn Gly Glu GlyGlu Pro Glu Glu Leu Met225 230 235 240Val Asp Asn Trp Arg Pro AlaGln Pro Leu Lys Asn Arg Gln Ile Lys 245 250 255Ala Ser Phe Lys26026260PRTPongo abelii 26Met Ser His His Trp Gly Tyr Gly Lys HisAsn Gly Pro Glu His Trp1 5 10 15His Lys Asp Phe Pro Ile Ala Lys GlyGlu Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr His Thr Ala Lys TyrAsp Pro Ser Leu Lys Pro Leu Ser 35 40 45Val Cys Tyr Asp Gln Ala ThrSer Leu Arg Ile Leu Asn Asn Gly His 50 55 60Ser Phe Asn Val Glu PheAsp Asp Ser Gln Asp Lys Ala Val Leu Lys65 70 75 80Gly Gly Pro LeuAsp Gly Thr Tyr Arg Leu Ile Gln Phe His Phe His 85 90 95Trp Gly SerLeu Asp Gly Gln Gly Ser Glu His Thr Val Asp Lys Lys 100 105 110LysTyr Ala Ala Glu Leu His Leu Val His Trp Asn Thr Lys Tyr Gly 115 120125Asp Phe Gly Lys Ala Val Gln Gln Pro Asp Gly Leu Ala Val Leu Gly130 135 140Ile Phe Leu Lys Val Gly Ser Ala Lys Pro Gly Leu Gln LysVal Val145 150 155 160Asp Val Leu Asp Ser Ile Lys Thr Lys Gly LysCys Ala Asp Phe Thr 165 170 175Asn Phe Asp Pro Arg Gly Leu Leu ProAla Ser Leu Asp Tyr Trp Thr 180 185 190Tyr Pro Gly Ser Leu Thr ThrPro Pro Leu Leu Glu Cys Val Thr Trp 195 200 205Ile Val Leu Lys GluPro Ile Ser Val Ser Ser Glu Gln Met Leu Lys 210 215 220Phe Arg LysLeu Asn Phe Asn Gly Glu Gly Glu Pro Glu Glu Leu Met225 230 235240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu Lys Lys Arg Gln Ile Lys245 250 255Ala Ser Phe Lys 26027260PRTPongo abelii 27Met Ser HisHis Trp Gly Tyr Gly Lys His Asn Gly Pro Glu His Trp1 5 10 15His LysAsp Phe Pro Ile Ala Lys Gly Glu Arg Gln Ser Pro Val Asp 20 25 30IleAsp Thr His Thr Ala Lys Tyr Asp Pro Ser Leu Lys Pro Leu Ser 35 4045Val Cys Tyr Asp Gln Ala Thr Ser Leu Arg Ile Leu Asn Asn Gly His50 55 60Ser Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Lys Ala Val LeuLys65 70 75 80Gly Gly Pro Leu Asp Gly Thr Tyr Arg Leu Ile 
Gln PheHis Phe His 85 90 95Trp Gly Ser Leu Asp Gly Gln Gly Ser Glu His ThrVal Asp Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu Val HisTrp Asn Thr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala Val Gln GlnPro Asp Gly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu Lys Val GlySer Ala Lys Pro Gly Leu Gln Lys Val Val145 150 155 160Asp Val LeuAsp Ser Ile Lys Thr Lys Gly Lys Cys Ala Asp Phe Thr 165 170 175AsnPhe Asp Pro Arg Gly Leu Leu Pro Ala Ser Leu Asp Tyr Trp Thr 180 185190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys Val Thr Trp195 200 205Ile Val Leu Lys Glu Pro Ile Ser Val Ser Ser Glu Gln MetLeu Lys 210 215 220Phe Arg Lys Leu Asn Phe Asn Gly Glu Gly Glu ProGlu Glu Leu Met225 230 235 240Val Asp Asn Trp Arg Pro Ala Gln ProLeu Lys Lys Arg Gln Ile Lys 245 250 255Ala Ser Phe Lys26028260PRTCallithrix jacchus 28Met Ser His His Trp Gly Tyr Gly LysHis Asn Gly Pro Glu His Trp1 5 10 15His Lys Asp Phe Pro Ile Ala LysGly Glu Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr His Thr Ala LysTyr Asp Pro Ser Leu Lys Pro Leu Ser 35 40 45Val Ser Tyr Asp Gln AlaThr Ser Trp Arg Ile Leu Asn Asn Gly His 50 55 60Ser Phe Asn Val GluPhe Asp Asp Ser Gln Asp Lys Ala Val Leu Lys65 70 75 80Gly Gly ProLeu Asp Gly Thr Tyr Arg Leu Ile Gln Phe His Phe His 85 90 95Trp GlySer Thr Asp Gly Gln Gly Ser Glu His Thr Val Asp Lys Lys 100 105110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp Asn Thr Lys Tyr Gly115 120 125Asp Phe Gly Lys Ala Ala Gln Gln Pro Asp Gly Leu Ala ValLeu Gly
130 135 140Ile Phe Leu Lys Val Gly Ser Ala Lys Pro Gly Leu Gln LysVal Val145 150 155 160Asp Val Leu Asp Ser Ile Lys Thr Lys Gly LysSer Ala Asp Phe Thr 165 170 175Asn Phe Asp Pro Arg Gly Leu Leu ProGlu Ser Leu Asp Tyr Trp Thr 180 185 190Tyr Pro Gly Ser Leu Thr ThrPro Pro Leu Leu Glu Ser Val Thr Trp 195 200 205Ile Val Leu Lys GluPro Ile Ser Val Ser Ser Glu Gln Ile Leu Lys 210 215 220Phe Arg LysLeu Asn Phe Ser Gly Glu Gly Glu Pro Glu Glu Leu Met225 230 235240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu Lys Asn Arg Gln Ile Lys245 250 255Ala Ser Phe Lys 26029260PRTLemur catta 29Met Ser His HisTrp Gly Tyr Gly Lys His Asn Gly Pro Glu His Trp1 5 10 15His Lys AspPhe Pro Ile Ala Lys Gly Glu Arg Gln Ser Pro Val Asp 20 25 30Ile AsnThr Gly Ala Ala Lys His Asp Pro Ser Leu Lys Pro Leu Ser 35 40 45ValTyr Tyr Glu Gln Ala Thr Ser Arg Arg Ile Leu Asn Asn Gly His 50 5560Ser Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Lys Ala Val Leu Lys6570 75 80Gly Gly Pro Leu Asp Gly Thr Tyr Arg Leu Ile Gln Phe His PheHis 85 90 95Trp Gly Ser Leu Asp Gly Gln Gly Ser Glu His Thr Val AspLys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp AsnThr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala Val Gln Gln Pro AspGly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu Lys Val Gly Ser AlaLys Pro Gly Leu Gln Lys Val Val145 150 155 160Asp Val Leu Asp SerIle Lys Thr Lys Gly Lys Ser Ala Asp Phe Thr 165 170 175Asn Phe AspPro Arg Gly Leu Leu Pro Glu Ser Leu Asp Tyr Trp Thr 180 185 190TyrLeu Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys Val Thr Trp 195 200205Ile Val Leu Lys Glu Pro Ile Ser Val Ser Ser Glu Gln Met Met Lys210 215 220Phe Arg Lys Leu Ser Phe Ser Gly Glu Gly Glu Pro Glu GluLeu Met225 230 235 240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu LysAsn Arg Gln Ile Lys 245 250 255Ala Ser Phe Lys26030260PRTAiluropoda melanoleuca 30Met Ala His His Trp Gly Tyr GlyLys His Asn Gly Pro Glu His Trp1 5 10 15Tyr Lys Asp Phe Pro Ile AlaLys Gly Gln Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr Lys Ala AlaIle His Asp Pro Ala Leu Lys Ala Leu 
Cys 35 40 45Pro Thr Tyr Glu GlnAla Val Ser Gln Arg Val Ile Asn Asn Gly His 50 55 60Ser Phe Asn ValGlu Phe Asp Asp Ser Gln Asp Asn Ala Val Leu Lys65 70 75 80Gly GlyPro Leu Thr Gly Thr Tyr Arg Leu Ile Gln Phe His Phe His 85 90 95TrpGly Ser Ser Asp Gly Gln Gly Ser Glu His Thr Val Asp Lys Lys 100 105110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp Asn Thr Lys Tyr Gly115 120 125Asp Phe Gly Lys Ala Val Gln Gln Pro Asp Gly Leu Ala ValLeu Gly 130 135 140Ile Phe Leu Lys Ile Gly Asp Ala Arg Pro Gly LeuGln Lys Val Leu145 150 155 160Asp Ala Leu Asp Ser Ile Lys Thr LysGly Lys Ser Ala Asp Phe Thr 165 170 175Asn Phe Asp Pro Arg Gly LeuLeu Pro Glu Ser Leu Asp Tyr Trp Thr 180 185 190Tyr Pro Gly Ser LeuThr Thr Pro Pro Leu Leu Glu Cys Val Thr Trp 195 200 205Ile Val LeuLys Glu Pro Ile Ser Val Ser Ser Glu Gln Met Leu Lys 210 215 220PheArg Arg Leu Asn Phe Asn Lys Glu Gly Glu Pro Glu Glu Leu Met225 230235 240Val Asp Asn Trp Arg Pro Ala Gln Pro Leu His Asn Arg Gln IleAsn 245 250 255Ala Ser Phe Lys 26031260PRTEquus caballus 31Met SerHis His Trp Gly Tyr Gly Gln His Asn Gly Pro Lys His Trp1 5 10 15HisLys Asp Phe Pro Ile Ala Lys Gly Gln Arg Gln Ser Pro Val Asp 20 2530Ile Asp Thr Lys Ala Ala Val His Asp Ala Ala Leu Lys Pro Leu Ala35 40 45Val His Tyr Glu Gln Ala Thr Ser Arg Arg Ile Val Asn Asn GlyHis 50 55 60Ser Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Lys Ala ValLeu Gln65 70 75 80Gly Gly Pro Leu Thr Gly Thr Tyr Arg Leu Ile GlnPhe His Phe His 85 90 95Trp Gly Ser Ser Asp Gly Gln Gly Ser Glu HisThr Val Asp Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu ValHis Trp Asn Thr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala Val GlnGln Pro Asp Gly Leu Ala Val Val Gly 130 135 140Val Phe Leu Lys ValGly Gly Ala Lys Pro Gly Leu Gln Lys Val Leu145 150 155 160Asp ValLeu Asp Ser Ile Lys Thr Lys Gly Lys Ser Ala Asp Phe Thr 165 170175Asn Phe Asp Pro Arg Gly Leu Leu Pro Glu Ser Leu Asp Tyr Trp Thr180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys ValThr Trp 195 200 205Ile Val Leu Arg Glu Pro Ile Ser Val 
Ser Ser GluGln Leu Leu Lys 210 215 220Phe Arg Ser Leu Asn Phe Asn Ala Glu GlyLys Pro Glu Asp Pro Met225 230 235 240Val Asp Asn Trp Arg Pro AlaGln Pro Leu Asn Ser Arg Gln Ile Arg 245 250 255Ala Ser Phe Lys26032260PRTCanis lupus 32Met Ala His His Trp Gly Tyr Ala Lys HisAsn Gly Pro Glu His Trp1 5 10 15His Lys Asp Phe Pro Ile Ala Lys GlyGlu Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr Lys Ala Ala Val HisAsp Pro Ala Leu Lys Ser Leu Cys 35 40 45Pro Cys Tyr Asp Gln Ala ValSer Gln Arg Ile Ile Asn Asn Gly His 50 55 60Ser Phe Asn Val Glu PheAsp Asp Ser Gln Asp Lys Thr Val Leu Lys65 70 75 80Gly Gly Pro LeuThr Gly Thr Tyr Arg Leu Ile Gln Phe His Phe His 85 90 95Trp Gly SerSer Asp Gly Gln Gly Ser Glu His Thr Val Asp Lys Lys 100 105 110LysTyr Ala Ala Glu Leu His Leu Val His Trp Asn Thr Lys Tyr Gly 115 120125Glu Phe Gly Lys Ala Val Gln Gln Pro Asp Gly Leu Ala Val Leu Gly130 135 140Ile Phe Leu Lys Ile Gly Gly Ala Asn Pro Gly Leu Gln LysIle Leu145 150 155 160Asp Ala Leu Asp Ser Ile Lys Thr Lys Gly LysSer Ala Asp Phe Thr 165 170 175Asn Phe Asp Pro Arg Gly Leu Leu ProGlu Ser Leu Asp Tyr Trp Thr 180 185 190Tyr Pro Gly Ser Leu Thr ThrPro Pro Leu Leu Glu Cys Val Thr Trp 195 200 205Ile Val Leu Lys GluPro Ile Ser Val Ser Ser Glu Gln Met Leu Lys 210 215 220Phe Arg LysLeu Asn Phe Asn Lys Glu Gly Glu Pro Glu Glu Leu Met225 230 235240Met Asp Asn Trp Arg Pro Ala Gln Pro Leu His Ser Arg Gln Ile Asn245 250 255Ala Ser Phe Lys 26033260PRTOryctolagus cuniculus 33MetSer His His Trp Gly Tyr Gly Lys His Asn Gly Pro Glu His Trp1 5 1015His Lys Asp Phe Pro Ile Ala Asn Gly Glu Arg Gln Ser Pro Ile Asp20 25 30Ile Asp Thr Asn Ala Ala Lys His Asp Pro Ser Leu Lys Pro LeuArg 35 40 45Val Cys Tyr Glu His Pro Ile Ser Arg Arg Ile Ile Asn AsnGly His 50 55 60Ser Phe Asn Val Glu Phe Asp Asp Ser His Asp Lys ThrVal Leu Lys65 70 75 80Glu Gly Pro Leu Glu Gly Thr Tyr Arg Leu IleGln Phe His Phe His 85 90 95Trp Gly Ser Ser Asp Gly Gln Gly Ser GluHis Thr Val Asn Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His LeuVal His Trp Asn 
Thr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala ValLys His Pro Asp Gly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu LysIle Gly Ser Ala Thr Pro Gly Leu Gln Lys Val Val145 150 155 160AspThr Leu Ser Ser Ile Lys Thr Lys Gly Lys Ser Val Asp Phe Thr 165 170175Asp Phe Asp Pro Arg Gly Leu Leu Pro Glu Ser Leu Asp Tyr Trp Thr180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys ValThr Trp 195 200 205Ile Val Leu Lys Glu Pro Ile Thr Val Ser Ser GluGln Met Leu Lys 210 215 220Phe Arg Asn Leu Asn Phe Asn Lys Glu AlaGlu Pro Glu Glu Pro Met225 230 235 240Val Asp Asn Trp Arg Pro ThrGln Pro Leu Lys Gly Arg Gln Val Lys 245 250 255Ala Ser Phe Val26034249PRTAiluropoda melanoleuca 34Gly Pro Glu His Trp Tyr Lys AspPhe Pro Ile Ala Lys Gly Gln Arg1 5 10 15Gln Ser Pro Val Asp Ile AspThr Lys Ala Ala Ile His Asp Pro Ala 20 25 30Leu Lys Ala Leu Cys ProThr Tyr Glu Gln Ala Val Ser Gln Arg Val 35 40 45Ile Asn Asn Gly HisSer Phe Asn Val Glu Phe Asp Asp Ser Gln Asp 50 55 60Asn Ala Val LeuLys Gly Gly Pro Leu Thr Gly Thr Tyr Arg Leu Ile65 70 75 80Gln PheHis Phe His Trp Gly Ser Ser Asp Gly Gln Gly Ser Glu His 85 90 95ThrVal Asp Lys Lys Lys Tyr Ala Ala Glu Leu His Leu Val His Trp 100 105110Asn Thr Lys Tyr Gly Asp Phe Gly Lys Ala Val Gln Gln Pro Asp Gly115 120 125Leu Ala Val Leu Gly Ile Phe Leu Lys Ile Gly Asp Ala ArgPro Gly 130 135 140Leu Gln Lys Val Leu Asp Ala Leu Asp Ser Ile LysThr Lys Gly Lys145 150 155 160Ser Ala Asp Phe Thr Asn Phe Asp ProArg Gly Leu Leu Pro Glu Ser 165 170 175Leu Asp Tyr Trp Thr Tyr ProGly Ser Leu Thr Thr Pro Pro Leu Leu 180 185 190Glu Cys Val Thr TrpIle Val Leu Lys Glu Pro Ile Ser Val Ser Ser 195 200 205Glu Gln MetLeu Lys Phe Arg Arg Leu Asn Phe Asn Lys Glu Gly Glu 210 215 220ProGlu Glu Leu Met Val Asp Asn Trp Arg Pro Ala Gln Pro Leu His225 230235 240Asn Arg Gln Ile Asn Ala Ser Phe Lys 24535260PRTSus scrofa35Met Ser His His Trp Gly Tyr Asp Lys His Asn Gly Pro Glu His Trp15 10 15His Lys Asp Phe Pro Ile Ala Lys Gly Asp Arg Gln Ser Pro ValAsp 20 25 30Ile Asn Thr Ser Thr Ala 
Val His Asp Pro Ala Leu Lys ProLeu Ser 35 40 45Leu Cys Tyr Glu Gln Ala Thr Ser Gln Arg Ile Val AsnAsn Gly His 50 55 60Ser Phe Asn Val Glu Phe Asp Ser Ser Gln Asp LysGly Val Leu Glu65 70 75 80Gly Gly Pro Leu Ala Gly Thr Tyr Arg LeuIle Gln Phe His Phe His 85 90 95Trp Gly Ser Ser Asp Gly Gln Gly SerGlu His Thr Val Asp Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu HisLeu Val His Trp Asn Thr Lys Tyr Lys 115 120 125Asp Phe Gly Glu AlaAla Gln Gln Pro Asp Gly Leu Ala Val Leu Gly 130 135 140Val Phe LeuLys Ile Gly Asn Ala Gln Pro Gly Leu Gln Lys Ile Val145 150 155160Asp Val Leu Asp Ser Ile Lys Thr Lys Gly Lys Ser Val Glu Phe Thr165 170 175Gly Phe Asp Pro Arg Asp Leu Leu Pro Gly Ser Leu Asp TyrTrp Thr 180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu GluSer Val Thr Trp 195 200 205Ile Val Leu Arg Glu Pro Ile Ser Val SerSer Gly Gln Met Met Lys 210 215 220Phe Arg Thr Leu Asn Phe Asn LysGlu Gly Glu Pro Glu His Pro Met225 230 235 240Val Asp Asn Trp ArgPro Thr Gln Pro Leu Lys Asn Arg Gln Ile Arg 245 250 255Ala Ser PheGln 26036235PRTCallithrix jacchus 36Met Ser His His Trp Gly Tyr GlyLys His Asn Gly Pro Glu His Trp1 5 10 15His Lys Asp Phe Pro Ile AlaLys Gly Glu Arg Gln Ser Pro Val Asp 20 25 30Ile Asp Thr His Thr AlaLys Tyr Asp Pro Ser Leu Lys Pro Leu Ser 35 40 45Val Ser Tyr Asp GlnAla Thr Ser Trp Arg Ile Leu Asn Asn Gly His 50 55 60Ser Phe Asn ValGlu Phe Asp Asp Ser Gln Asp Lys Ala Val Leu Lys65 70 75 80Gly GlyPro Leu Asp Gly Thr Tyr Arg Leu Ile Gln Leu His Leu Val 85 90 95HisTrp Asn Thr Lys Tyr Gly Asp Phe Gly Lys Ala Ala Gln Gln Pro 100 105110Asp Gly Leu Ala Val Leu Gly Ile Phe Leu Lys Val Gly Ser Ala Lys115 120 125Pro Gly Leu Gln Lys Val Val Asp Val Leu Asp Ser Ile LysThr Lys 130 135 140Gly Lys Ser Ala Asp Phe Thr Asn Phe Asp Pro ArgGly Leu Leu Pro145 150 155 160Glu Ser Leu Asp Tyr Trp Thr Tyr ProGly Ser Leu Thr Thr Pro Pro 165 170 175Leu Leu Glu Ser Val Thr TrpIle Val Leu Lys Glu Pro Ile Ser Val 180 185 190Ser Ser Glu Gln IleLeu Lys Phe Arg Lys Leu Asn Phe Ser Gly Glu 195 200 
205Gly Glu ProGlu Glu Leu Met Val Asp Asn Trp Arg Pro Ala Gln Pro 210 215 220LeuLys Asn Arg Gln Ile Lys Ala Ser Phe Lys225 230 23537260PRTMusmusculus 37Met Ser His His Trp Gly Tyr Ser Lys His Asn Gly Pro GluAsn Trp1 5 10 15His Lys Asp Phe Pro Ile Ala Asn Gly Asp Arg Gln SerPro Val Asp 20 25 30Ile Asp Thr Ala Thr Ala Gln His Asp Pro Ala LeuGln Pro Leu Leu 35 40 45Ile Ser Tyr Asp Lys Ala Ala Ser Lys Ser IleVal Asn Asn Gly His 50 55 60Ser Phe Asn Val Glu Phe Asp Asp Ser GlnAsp Asn Ala Val Leu Lys65 70 75 80Gly Gly Pro Leu Ser Asp Ser TyrArg Leu Ile Gln Phe His Phe His 85 90 95Trp Gly Ser Ser Asp Gly GlnGly Ser Glu His Thr Val Asn Lys Lys 100 105 110Lys Tyr Ala Ala GluLeu His Leu Val His Trp Asn Thr Lys Tyr Gly 115 120 125Asp Phe GlyLys Ala Val Gln Gln Pro Asp Gly Leu Ala Val Leu Gly 130 135 140IlePhe Leu Lys Ile Gly Pro Ala Ser Gln Gly Leu Gln Lys Val Leu145 150155 160Glu Ala Leu His Ser Ile Lys Thr Lys Gly Lys Arg Ala Ala PheAla 165 170 175Asn Phe Asp Pro Cys Ser Leu Leu Pro Gly Asn Leu AspTyr Trp Thr 180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu LeuGlu Cys Val Thr Trp 195 200 205Ile Val Leu Arg Glu Pro Ile Thr ValSer Ser Glu Gln Met Ser His 210 215 220Phe Arg Thr Leu Asn Phe AsnGlu Glu Gly Asp Ala Glu Glu Ala Met225 230 235 240Val Asp Asn TrpArg Pro Ala Gln Pro Leu Lys Asn Arg Lys Ile Lys 245 250 255Ala SerPhe Lys 26038260PRTBos taurus 38Met Ser His His Trp Gly Tyr Gly LysHis Asn Gly Pro Glu His Trp1 5 10 15His Lys Asp Phe
Pro Ile Ala Asn Gly Glu Arg Gln Ser Pro Val Asp 20 25 30Ile Asp ThrLys Ala Val Val Gln Asp Pro Ala Leu Lys Pro Leu Ala 35 40 45Leu ValTyr Gly Glu Ala Thr Ser Arg Arg Met Val Asn Asn Gly His 50 55 60SerPhe Asn Val Glu Tyr Asp Asp Ser Gln Asp Lys Ala Val Leu Lys65 70 7580Asp Gly Pro Leu Thr Gly Thr Tyr Arg Leu Val Gln Phe His Phe His85 90 95Trp Gly Ser Ser Asp Asp Gln Gly Ser Glu His Thr Val Asp ArgLys 100 105 110Lys Tyr Ala Ala Glu Leu His Leu Val His Trp Asn ThrLys Tyr Gly 115 120 125Asp Phe Gly Thr Ala Ala Gln Gln Pro Asp GlyLeu Ala Val Val Gly 130 135 140Val Phe Leu Lys Val Gly Asp Ala AsnPro Ala Leu Gln Lys Val Leu145 150 155 160Asp Ala Leu Asp Ser IleLys Thr Lys Gly Lys Ser Thr Asp Phe Pro 165 170 175Asn Phe Asp ProGly Ser Leu Leu Pro Asn Val Leu Asp Tyr Trp Thr 180 185 190Tyr ProGly Ser Leu Thr Thr Pro Pro Leu Leu Glu Ser Val Thr Trp 195 200205Ile Val Leu Lys Glu Pro Ile Ser Val Ser Ser Gln Gln Met Leu Lys210 215 220Phe Arg Thr Leu Asn Phe Asn Ala Glu Gly Glu Pro Glu LeuLeu Met225 230 235 240Leu Ala Asn Trp Arg Pro Ala Gln Pro Leu LysAsn Arg Gln Val Arg 245 250 255Gly Phe Pro Lys26039232PRTOryctolagus cuniculus 39Gly Lys His Asn Gly Pro Glu HisTrp His Lys Asp Phe Pro Ile Ala1 5 10 15Asn Gly Glu Arg Gln Ser ProIle Asp Ile Asp Thr Asn Ala Ala Lys 20 25 30His Asp Pro Ser Leu LysPro Leu Arg Val Cys Tyr Glu His Pro Ile 35 40 45Ser Arg Arg Ile IleAsn Asn Gly His Ser Phe Asn Val Glu Phe Asp 50 55 60Asp Ser His AspLys Thr Val Leu Lys Glu Gly Pro Leu Glu Gly Thr65 70 75 80Tyr ArgLeu Ile Gln Phe His Phe His Trp Gly Ser Ser Asp Gly Gln 85 90 95GlySer Glu His Thr Val Asn Lys Lys Lys Tyr Ala Ala Glu Leu His 100 105110Leu Val His Trp Asn Thr Lys Tyr Gly Asp Phe Gly Lys Ala Val Lys115 120 125His Pro Asp Gly Leu Ala Val Leu Gly Ile Phe Leu Lys IleGly Ser 130 135 140Ala Thr Pro Gly Leu Gln Lys Val Val Asp Thr LeuSer Ser Ile Lys145 150 155 160Thr Lys Gly Lys Ser Val Asp Phe ThrAsp Phe Asp Pro Arg Gly Leu 165 170 175Leu Pro Glu Ser Leu Asp TyrTrp Thr Tyr Pro Gly Ser Leu Thr Thr 
180 185 190Pro Pro Leu Leu GluCys Val Thr Trp Ile Val Leu Lys Glu Pro Ile 195 200 205Thr Val SerSer Glu Gln Met Leu Lys Phe Arg Asn Leu Asn Phe Asn 210 215 220LysGlu Ala Glu Pro Glu Glu Pro225 23040260PRTRattus norvegicus 40MetSer His His Trp Gly Tyr Ser Lys Ser Asn Gly Pro Glu Asn Trp1 5 1015His Lys Glu Phe Pro Ile Ala Asn Gly Asp Arg Gln Ser Pro Val Asp20 25 30Ile Asp Thr Gly Thr Ala Gln His Asp Pro Ser Leu Gln Pro LeuLeu 35 40 45Ile Cys Tyr Asp Lys Val Ala Ser Lys Ser Ile Val Asn AsnGly His 50 55 60Ser Phe Asn Val Glu Phe Asp Asp Ser Gln Asp Phe AlaVal Leu Lys65 70 75 80Glu Gly Pro Leu Ser Gly Ser Tyr Arg Leu IleGln Phe His Phe His 85 90 95Trp Gly Ser Ser Asp Gly Gln Gly Ser GluHis Thr Val Asn Lys Lys 100 105 110Lys Tyr Ala Ala Glu Leu His LeuVal His Trp Asn Thr Lys Tyr Gly 115 120 125Asp Phe Gly Lys Ala ValGln His Pro Asp Gly Leu Ala Val Leu Gly 130 135 140Ile Phe Leu LysIle Gly Pro Ala Ser Gln Gly Leu Gln Lys Ile Thr145 150 155 160GluAla Leu His Ser Ile Lys Thr Lys Gly Lys Arg Ala Ala Phe Ala 165 170175Asn Phe Asp Pro Cys Ser Leu Leu Pro Gly Asn Leu Asp Tyr Trp Thr180 185 190Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Leu Glu Cys ValThr Trp 195 200 205Ile Val Leu Lys Glu Pro Ile Thr Val Ser Ser GluGln Met Ser His 210 215 220Phe Arg Lys Leu Asn Phe Asn Ser Glu GlyGlu Ala Glu Glu Leu Met225 230 235 240Val Asp Asn Trp Arg Pro AlaGln Pro Leu Lys Asn Arg Lys Ile Lys 245 250 255Ala Ser Phe Lys26041208PRTHomo sapiens 41Met Ser Leu Ser Ile Thr Asn Asn Gly HisSer Val Gln Val Asp Phe1 5 10 15Asn Asp Ser Asp Asp Arg Thr Val ValThr Gly Gly Pro Leu Glu Gly 20 25 30Pro Tyr Arg Leu Lys Gln Phe HisPhe His Trp Gly Lys Lys His Asp 35 40 45Val Gly Ser Glu His Thr ValAsp Gly Lys Ser Phe Pro Ser Glu Leu 50 55 60His Leu Val His Trp AsnAla Lys Lys Tyr Ser Thr Phe Gly Glu Ala65 70 75 80Ala Ser Ala ProAsp Gly Leu Ala Val Val Gly Val Phe Leu Glu Thr 85 90 95Gly Asp GluHis Pro Ser Met Asn Arg Leu Thr Asp Ala Leu Tyr Met 100 105 110ValArg Phe Lys Gly Thr Lys Ala Gln Phe Ser Cys Phe Asn Pro 
Lys 115 120125Cys Leu Leu Pro Ala Ser Arg His Tyr Trp Thr Tyr Pro Gly Ser Leu130 135 140Thr Thr Pro Pro Leu Ser Glu Ser Val Thr Trp Ile Val LeuArg Glu145 150 155 160Pro Ile Cys Ile Ser Glu Arg Gln Met Gly LysPhe Arg Ser Leu Leu 165 170 175Phe Thr Ser Glu Asp Asp Glu Arg IleHis Met Val Asn Asn Phe Arg 180 185 190Pro Pro Gln Pro Leu Lys GlyArg Val Val Lys Ala Ser Phe Arg Ala 195 200 20542264PRTPongo abelii42Met Thr Gly His His Gly Trp Gly Tyr Gly Gln Asp Asp Gly Pro Ser15 10 15His Trp His Lys Leu Tyr Pro Ile Ala Gln Gly Asp Arg Gln SerPro 20 25 30Ile Asn Ile Ile Ser Ser Gln Ala Val Tyr Ser Pro Ser LeuGln Pro 35 40 45Leu Glu Leu Ser Tyr Glu Ala Cys Met Ser Leu Ser IleThr Asn Asn 50 55 60Gly His Ser Val Gln Val Asp Phe Asn Asp Ser AspAsp Arg Thr Val65 70 75 80Val Thr Gly Gly Pro Leu Glu Gly Pro TyrArg Leu Lys Gln Phe His 85 90 95Phe His Trp Gly Lys Lys His Asp ValGly Ser Glu His Thr Val Asp 100 105 110Gly Lys Ser Phe Pro Ser GluLeu His Leu Val His Trp Asn Ala Lys 115 120 125Lys Tyr Ser Thr PheGly Glu Ala Ala Ser Ala Pro Asp Gly Leu Ala 130 135 140Val Val GlyVal Phe Leu Glu Thr Gly Asp Glu His Pro Ser Met Asn145 150 155160Arg Leu Thr Asp Ala Leu Tyr Met Val Arg Phe Lys Gly Thr Lys Ala165 170 175Gln Phe Ser Cys Phe Asn Pro Lys Ser Leu Leu Pro Ala SerArg His 180 185 190Tyr Trp Thr Tyr Pro Gly Ser Leu Thr Thr Pro ProLeu Ser Glu Ser 195 200 205Val Thr Trp Ile Val Leu Arg Glu Pro IleCys Ile Ser Glu Arg Gln 210 215 220Met Gly Lys Phe Arg Ser Leu LeuPhe Thr Ser Glu Asp Asp Glu Arg225 230 235 240Ile His Met Val AsnAsn Phe Arg Pro Pro Gln Pro Leu Lys Gly Arg 245 250 255Val Val LysAla Ser Phe Arg Ala 26043264PRTPan troglodytes 43Met Thr Gly HisHis Gly Trp Gly Tyr Gly Gln Asp Asp Gly Pro Ser1 5 10 15His Trp HisLys Leu Tyr Pro Ile Ala Gln Gly Asp Arg Gln Ser Pro 20 25 30Ile AsnIle Ile Ser Ser Gln Ala Val Tyr Ser Pro Ser Leu Gln Pro 35 40 45LeuGlu Leu Ser Tyr Glu Ala Cys Met Ser Leu Ser Ile Thr Asn Asn 50 5560Gly His Ser Val Gln Val Asp Phe Asn Asp Ser Asp Asp Arg Thr Val6570 75 
80Val Thr Gly Gly Pro Leu Glu Gly Pro Tyr Arg Leu Lys Gln PheHis 85 90 95Phe His Trp Gly Lys Lys His Asp Val Gly Ser Glu His ThrVal Asp 100 105 110Gly Lys Ser Phe Pro Ser Glu Leu His Leu Val HisTrp Asn Ala Lys 115 120 125Lys Tyr Ser Thr Phe Gly Glu Ala Ala SerAla Pro Asp Gly Leu Ala 130 135 140Val Val Gly Val Phe Leu Glu ThrGly Asp Glu His Pro Ser Met Asn145 150 155 160Arg Leu Thr Asp AlaLeu Tyr Met Val Arg Phe Lys Gly Thr Lys Ala 165 170 175Gln Phe SerCys Phe Asn Pro Lys Cys Leu Leu Pro Ala Ser Arg His 180 185 190TyrTrp Thr Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Ser Glu Ser 195 200205Val Thr Trp Ile Val Leu Arg Glu Pro Ile Cys Ile Ser Glu Arg Gln210 215 220Met Arg Lys Phe Arg Ser Leu Leu Phe Thr Ser Glu Asp AspGlu Arg225 230 235 240Ile His Met Val Asn Asn Phe Arg Pro Pro GlnPro Leu Lys Gly Arg 245 250 255Val Val Lys Ala Ser Phe Arg Ala26044264PRTCallithrix jacchus 44Met Thr Gly His His Gly Trp Gly TyrGly Gln Asp Asp Gly Pro Ser1 5 10 15His Trp His Lys Leu Tyr Pro IleAla Gln Gly Asp Arg Gln Ser Pro 20 25 30Ile Asn Ile Ile Ser Ser GlnAla Val Tyr Ser Pro Ser Leu Gln Pro 35 40 45Leu Glu Leu Ser Tyr GluAla Cys Met Ser Leu Ser Ile Thr Asn Asn 50 55 60Gly His Ser Val GlnVal Asp Phe Asn Asp Ser Asp Asp Arg Thr Val65 70 75 80Val Thr GlyGly Pro Leu Glu Gly Pro Tyr Arg Leu Lys Gln Phe His 85 90 95Phe HisTrp Gly Lys Lys His Asp Val Gly Ser Glu His Thr Val Asp 100 105110Gly Lys Ser Phe Pro Ser Glu Leu His Leu Val His Trp Asn Ala Lys115 120 125Lys Tyr Ser Thr Phe Gly Glu Ala Ala Ser Ala Pro Asp GlyLeu Ala 130 135 140Val Val Gly Val Phe Leu Glu Thr Gly Asp Glu HisPro Ser Met Asn145 150 155 160Arg Leu Thr Asp Ala Leu Tyr Met ValArg Phe Lys Gly Thr Lys Ala 165 170 175Gln Phe Ser Cys Phe Asn ProLys Cys Leu Leu Pro Ala Ser Trp His 180 185 190Tyr Trp Thr Tyr ProGly Ser Leu Thr Thr Pro Pro Leu Ser Glu Ser 195 200 205Val Thr TrpIle Val Leu Arg Glu Pro Ile Cys Ile Ser Glu Arg Gln 210 215 220MetGly Lys Phe Arg Ser Leu Leu Phe Thr Ser Glu Asp Asp Glu Arg225 230235 240Val His Met Val Asn 
Asn Phe Arg Pro Pro Gln Pro Leu Lys GlyArg 245 250 255Val Val Lys Ala Ser Phe Arg Ala26045251PRTAiluropoda melanoleuca 45Gly Pro Ser Gln Trp His Lys LeuTyr Pro Ile Ala Gln Gly Asp Arg1 5 10 15Gln Ser Pro Ile Asn Ile ValSer Ser Gln Ala Val Tyr Ser Pro Ser 20 25 30Leu Lys Pro Leu Glu LeuSer Tyr Glu Ala Cys Ile Ser Leu Ser Ile 35 40 45Ala Asn Asn Gly HisSer Val Gln Val Asp Phe Asn Asp Ser Asp Asp 50 55 60Arg Thr Val ValThr Gly Gly Pro Leu Asp Gly Pro Tyr Arg Leu Lys65 70 75 80Gln PheHis Phe His Trp Gly Lys Lys His Ser Val Gly Ser Glu His 85 90 95ThrVal Asp Gly Lys Ser Phe Pro Ser Glu Leu His Leu Val His Trp 100 105110Asn Ala Lys Lys Tyr Ser Thr Phe Gly Glu Ala Ala Ser Ala Pro Asp115 120 125Gly Leu Ala Val Val Gly Val Phe Leu Glu Thr Gly Asp GluHis Pro 130 135 140Ser Met Asn Arg Leu Thr Asp Ala Leu Tyr Met ValArg Phe Lys Gly145 150 155 160Thr Lys Ala Gln Phe Ser Cys Phe AsnPro Lys Cys Leu Leu Pro Ala 165 170 175Ser Arg His Tyr Trp Thr TyrPro Gly Ser Leu Thr Thr Pro Pro Leu 180 185 190Ser Glu Ser Val ThrTrp Ile Val Leu Arg Glu Pro Ile Ser Ile Ser 195 200 205Glu Arg GlnMet Glu Lys Phe Arg Ser Leu Leu Phe Thr Ser Glu Asp 210 215 220AspGlu Arg Ile His Met Val Asn Asn Phe Arg Pro Pro Gln Pro Leu225 230235 240Lys Gly Arg Val Val Lys Ala Ser Phe Arg Ala 24525046264PRTCanis familiaris 46Met Thr Gly His His Cys Trp Gly TyrGly Gln Asn Asp Gly Pro Ser1 5 10 15Gln Trp His Lys Leu Tyr Pro IleAla Gln Gly Asp Arg Gln Ser Pro 20 25 30Ile Asn Ile Val Ser Ser GlnAla Val Tyr Ser Pro Ser Leu Lys Pro 35 40 45Leu Glu Leu Ser Tyr GluAla Cys Ile Ser Leu Ser Ile Thr Asn Asn 50 55 60Gly His Ser Val GlnVal Asp Phe Asn Asp Ser Asp Asp Arg Thr Ala65 70 75 80Val Thr GlyGly Pro Leu Asp Gly Pro Tyr Arg Leu Lys Gln Leu His 85 90 95Phe HisTrp Gly Lys Lys His Ser Val Gly Ser Glu His Thr Val Asp 100 105110Gly Lys Ser Phe Pro Ser Glu Leu His Leu Val His Trp Asn Ala Lys115 120 125Lys Tyr Ser Thr Phe Gly Glu Ala Ala Ser Ala Pro Asp GlyLeu Ala 130 135 140Val Val Gly Ile Phe Leu Glu Thr Gly Asp Glu HisPro Ser 
Met Asn145 150 155 160Arg Leu Thr Asp Ala Leu Tyr Met ValArg Phe Lys Gly Thr Lys Ala 165 170 175Gln Phe Ser Cys Phe Asn ProLys Cys Leu Leu Pro Ala Ser Arg His 180 185 190Tyr Trp Thr Tyr ProGly Ser Leu Thr Thr Pro Pro Leu Ser Glu Ser 195 200 205Val Thr TrpIle Val Leu Arg Glu Pro Ile Ser Ile Ser Glu Arg Gln 210 215 220MetGlu Lys Phe Arg Ser Leu Leu Phe Thr Ser Glu Glu Asp Glu Arg225 230235 240Ile His Met Val Asn Asn Phe Arg Pro Pro Gln Pro Leu Lys GlyArg 245 250 255Val Val Lys Ala Ser Phe Arg Ala 26047264PRTBostaurus 47Met Thr Gly His His Gly Trp Gly Tyr Gly Gln Asn Asp GlyPro Ser1 5 10 15His Trp His Lys Leu Tyr Pro Ile Ala Gln Gly Asp ArgGln Ser Pro 20 25 30Ile Asn Ile Val Ser Ser Gln Ala Val Tyr Ser ProSer Leu Lys Pro 35 40 45Leu Glu Ile Ser Tyr Glu Ser Cys Thr Ser LeuSer Ile Ala Asn Asn 50 55 60Gly His Ser Val Gln Val Asp Phe Asn AspSer Asp Asp Arg Thr Val65 70 75 80Val Ser Gly Gly Pro Leu Asp GlyPro Tyr Arg Leu Lys Gln Phe His 85 90 95Phe His Trp Gly Lys Lys HisGly Val Gly Ser Glu His Thr Val Asp 100 105 110Gly Lys Ser Phe ProSer Glu Leu His Leu Val His Trp Asn Ala Lys 115 120 125Lys Tyr SerThr Phe Gly Glu Ala Ala Ser Ala Pro Asp Gly Leu Ala 130 135 140ValVal Gly Val Phe Leu Glu Thr Gly Asp Glu His Pro Ser Met Asn145 150155 160Arg Leu Thr Asp Ala Leu Tyr Met Val Arg Phe Lys Gly Thr LysAla 165 170 175Gln Phe Ser Cys Phe Asn Pro Lys Cys Leu Leu Pro AlaSer Arg His 180 185 190Tyr Trp Thr Tyr Pro Gly Ser Leu Thr Thr ProPro Leu Ser Glu Ser 195
200 205Val Thr Trp Ile Val Leu Arg Glu Pro Ile Arg Ile Ser Glu ArgGln 210 215 220Met Glu Lys Phe Arg Ser Leu Leu Phe Thr Ser Glu GluAsp Glu Arg225 230 235 240Ile His Met Val Asn Asn Phe Arg Pro ProGln Pro Leu Lys Gly Arg 245 250 255Val Val Lys Ala Ser Phe Arg Ala26048271PRTRattus norvegicus 48Met Thr Val Leu Trp Trp Pro Met LeuArg Glu Glu Leu Met Ser Lys1 5 10 15Leu Arg Thr Gly Gly Pro Ser AsnTrp His Lys Leu Tyr Pro Ile Ala 20 25 30Gln Gly Asp Arg Gln Ser ProIle Asn Ile Ile Ser Ser Gln Ala Val 35 40 45Tyr Ser Pro Ser Leu GlnPro Leu Glu Leu Phe Tyr Glu Ala Cys Met 50 55 60Ser Leu Ser Ile ThrAsn Asn Gly His Ser Val Gln Val Asp Phe Asn65 70 75 80Asp Ser AspAsp Arg Thr Val Val Ala Gly Gly Pro Leu Glu Gly Pro 85 90 95Tyr ArgLeu Lys Gln Leu His Phe His Trp Gly Lys Lys Arg Asp Val 100 105110Gly Ser Glu His Thr Val Asp Gly Lys Ser Phe Pro Ser Glu Leu His115 120 125Leu Val His Trp Asn Ala Lys Lys Tyr Ser Thr Phe Gly GluAla Ala 130 135 140Ala Ala Pro Asp Gly Leu Ala Val Val Gly Ile PheLeu Glu Thr Gly145 150 155 160Asp Glu His Pro Ser Met Asn Arg LeuThr Asp Ala Leu Tyr Met Val 165 170 175Arg Phe Lys Asp Thr Lys AlaGln Phe Ser Cys Phe Asn Pro Lys Cys 180 185 190Leu Leu Pro Thr SerArg His Tyr Trp Thr Tyr Pro Gly Ser Leu Thr 195 200 205Thr Pro ProLeu Ser Glu Ser Val Thr Trp Ile Val Leu Arg Glu Pro 210 215 220IleArg Ile Ser Glu Arg Gln Met Glu Lys Phe Arg Ser Leu Leu Phe225 230235 240Thr Ser Glu Asp Asp Glu Arg Ile His Met Val Asn Asn Phe ArgPro 245 250 255Pro Gln Pro Leu Lys Gly Arg Val Val Lys Ala Ser PheGln Ser 260 265 27049266PRTOryctolagus cuniculus 49Met Thr Gly HisHis Gly Trp Gly Tyr Gly Gln Asp Asp Gly Gly Arg1 5 10 15Pro Ser HisTrp His Lys Leu Tyr Pro Ile Ala Gln Gly Asp Arg Gln 20 25 30Ser ProIle Asn Ile Val Ser Ser Gln Ala Val Tyr Ser Pro Gly Leu 35 40 45GlnPro Leu Glu Leu Ser Tyr Glu Ala Cys Thr Ser Leu Ser Ile Ala 50 5560Asn Asn Gly His Ser Val Gln Val Asp Phe Asn Asp Ser Asp Asp Arg6570 75 80Thr Val Val Thr Gly Gly Pro Leu Glu Gly Pro Tyr Arg Leu LysGln 85 90 95Phe 
His Phe His Trp Gly Lys Arg Arg Asp Ala Gly Ser GluHis Thr 100 105 110Val Asp Gly Lys Ser Phe Pro Ser Glu Leu His LeuVal His Trp Asn 115 120 125Ala Arg Lys Tyr Ser Thr Phe Gly Glu AlaAla Ser Ala Pro Asp Gly 130 135 140Leu Ala Val Val Gly Val Phe LeuGlu Thr Gly Asn Glu His Pro Ser145 150 155 160Met Asn Arg Leu ThrAsp Ala Leu Tyr Met Val Arg Phe Lys Gly Thr 165 170 175Lys Ala GlnPhe Ser Cys Phe Asn Pro Lys Cys Leu Leu Pro Ser Ser 180 185 190ArgHis Tyr Trp Thr Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Ser 195 200205Glu Ser Val Thr Trp Ile Val Leu Arg Glu Pro Ile Ser Ile Ser Glu210 215 220Arg Gln Met Glu Lys Phe Arg Ser Leu Leu Phe Thr Ser GluAsp Asp225 230 235 240Glu Arg Val His Met Val Asn Asn Phe Arg ProPro Gln Pro Leu Arg 245 250 255Gly Arg Val Val Lys Ala Ser Phe ArgAla 260 26550255PRTMus musculus 50Gly Gln Asp Asp Gly Pro Ser AsnTrp His Lys Leu Tyr Pro Ile Ala1 5 10 15Gln Gly Asp Arg Gln Ser ProIle Asn Ile Ile Ser Ser Gln Ala Val 20 25 30Tyr Ser Pro Ser Leu GlnPro Leu Glu Leu Phe Tyr Glu Ala Cys Met 35 40 45Ser Leu Ser Ile ThrAsn Asn Gly His Ser Val Gln Val Asp Phe Asn 50 55 60Asp Ser Asp AspArg Thr Val Val Ser Gly Gly Pro Leu Glu Gly Pro65 70 75 80Tyr ArgLeu Lys Gln Leu His Phe His Trp Gly Lys Lys Arg Asp Met 85 90 95GlySer Glu His Thr Val Asp Gly Lys Ser Phe Pro Ser Glu Leu His 100 105110Leu Val His Trp Asn Ala Lys Lys Tyr Ser Thr Phe Gly Glu Ala Ala115 120 125Ala Ala Pro Asp Gly Leu Ala Val Val Gly Val Phe Leu GluThr Gly 130 135 140Asp Glu His Pro Ser Met Asn Arg Leu Thr Asp AlaLeu Tyr Met Val145 150 155 160Arg Phe Lys Asp Thr Lys Ala Gln PheSer Cys Phe Asn Pro Lys Cys 165 170 175Leu Leu Pro Thr Ser Arg HisTyr Trp Thr Tyr Pro Gly Ser Leu Thr 180 185 190Thr Pro Pro Leu SerGlu Ser Val Thr Trp Ile Val Leu Arg Glu Pro 195 200 205Ile Arg IleSer Glu Arg Gln Met Glu Lys Phe Arg Ser Leu Leu Phe 210 215 220ThrSer Glu Asp Asp Glu Arg Ile His Met Val Asp Asn Phe Arg Pro225 230235 240Pro Gln Pro Leu Lys Gly Arg Val Val Lys Ala Ser Phe Gln Ala245 250 25551264PRTMonodelphis 
domestica 51Met Thr Gly His His GlyTrp Gly Tyr Gly Gln Glu Asp Gly Pro Ser1 5 10 15Glu Trp His Lys LeuTyr Pro Ile Ala Gln Gly Asp Arg Gln Ser Pro 20 25 30Ile Asp Ile ValSer Ser Gln Ala Val Tyr Asp Pro Thr Leu Lys Pro 35 40 45Leu Val LeuAla Tyr Glu Ser Cys Met Ser Leu Ser Ile Ala Asn Asn 50 55 60Gly HisSer Val Met Val Glu Phe Asp Asp Val Asp Asp Arg Thr Val65 70 7580Val Asn Gly Gly Pro Leu Asp Gly Pro Tyr Arg Leu Lys Gln Phe His85 90 95Phe His Trp Gly Lys Lys His Ser Leu Gly Ser Glu His Thr ValAsp 100 105 110Gly Lys Ser Phe Ser Ser Glu Leu His Leu Val His TrpAsn Gly Lys 115 120 125Lys Tyr Lys Thr Phe Ala Glu Ala Ala Ala AlaPro Asp Gly Leu Ala 130 135 140Val Val Gly Ile Phe Leu Glu Thr GlyAsp Glu His Ala Ser Met Asn145 150 155 160Arg Leu Thr Asp Ala LeuTyr Met Val Arg Phe Lys Gly Thr Lys Ala 165 170 175Gln Phe Asn SerPhe Asn Pro Lys Cys Leu Leu Pro Met Asn Leu Ser 180 185 190Tyr TrpThr Tyr Pro Gly Ser Leu Thr Thr Pro Pro Leu Ser Glu Ser 195 200205Val Thr Trp Ile Val Leu Lys Glu Pro Ile Thr Ile Ser Glu Lys Gln210 215 220Met Glu Lys Phe Arg Ser Leu Leu Phe Thr Ala Glu Glu AspGlu Lys225 230 235 240Val Arg Met Val Asn Asn Phe Arg Pro Pro GlnPro Leu Lys Gly Arg 245 250 255Val Val Gln Ala Ser Phe Arg Ser26052264PRTGallus gallus 52Met Thr Gly His His Ser Trp Gly Tyr GlyGln Asp Asp Gly Pro Ser1 5 10 15Glu Trp His Lys Ser Tyr Pro Ile AlaGln Gly Asn Arg Gln Ser Pro 20 25 30Ile Asp Ile Ile Ser Ala Lys AlaVal Tyr Asp Pro Lys Leu Met Pro 35 40 45Leu Val Ile Ser Tyr Glu SerCys Thr Ser Leu Asn Ile Ser Asn Asn 50 55 60Gly His Ser Val Met ValGlu Phe Glu Asp Ile Asp Asp Lys Thr Val65 70 75 80Ile Ser Gly GlyPro Phe Glu Ser Pro Phe Arg Leu Lys Gln Phe His 85 90 95Phe His TrpGly Ala Lys His Ser Glu Gly Ser Glu His Thr Ile Asp 100 105 110GlyLys Pro Phe Pro Cys Glu Leu His Leu Val His Trp Asn Ala Lys 115 120125Lys Tyr Ala Thr Phe Gly Glu Ala Ala Ala Ala Pro Asp Gly Leu Ala130 135 140Val Val Gly Val Phe Leu Glu Ile Gly Lys Glu His Ala AsnMet Asn145 150 155 160Arg Leu Thr Asp Ala Leu Tyr Met 
Val Lys PheLys Gly Thr Lys Ala 165 170 175Gln Phe Arg Ser Phe Asn Pro Lys CysLeu Leu Pro Leu Ser Leu Asp 180 185 190Tyr Trp Thr Tyr Leu Gly SerLeu Thr Thr Pro Pro Leu Asn Glu Ser 195 200 205Val Ile Trp Val ValLeu Lys Glu Pro Ile Ser Ile Ser Glu Lys Gln 210 215 220Leu Glu LysPhe Arg Met Leu Leu Phe Thr Ser Glu Glu Asp Gln Lys225 230 235240Val Gln Met Val Asn Asn Phe Arg Pro Pro Gln Pro Leu Lys Gly Arg245 250 255Thr Val Arg Ala Ser Phe Lys Ala 26053264PRTTaeniopygiaguttata 53Met Thr Gly Gln His Ser Trp Gly Tyr Gly Gln Ala Asp GlyPro Ser1 5 10 15Glu Trp His Lys Ala Tyr Pro Ile Ala Gln Gly Asn ArgGln Ser Pro 20 25 30Ile Asp Ile Asp Ser Ala Arg Ala Val Tyr Asp ProSer Leu Gln Pro 35 40 45Leu Leu Ile Ser Tyr Glu Ser Cys Ser Ser LeuSer Ile Ser Asn Thr 50 55 60Gly His Ser Val Met Val Glu Phe Glu AspThr Asp Asp Arg Thr Ala65 70 75 80Ile Ser Gly Gly Pro Phe Gln AsnPro Phe Arg Leu Lys Gln Phe His 85 90 95Phe His Trp Gly Thr Thr HisSer Gln Gly Ser Glu His Thr Ile Asp 100 105 110Gly Lys Pro Phe ProCys Glu Leu His Leu Val His Trp Asn Ala Arg 115 120 125Lys Tyr ThrThr Phe Gly Glu Ala Ala Ala Ala Pro Asp Gly Leu Ala 130 135 140ValVal Gly Val Phe Leu Glu Ile Gly Lys Glu His Ala Ser Met Asn145 150155 160Arg Leu Thr Asp Ala Leu Tyr Met Val Lys Phe Lys Gly Thr LysAla 165 170 175Gln Phe Arg Gly Phe Asn Pro Lys Cys Leu Leu Pro LeuSer Leu Asp 180 185 190Tyr Trp Thr Tyr Leu Gly Ser Leu Thr Thr ProPro Leu Asn Glu Ser 195 200 205Val Thr Trp Ile Val Leu Lys Glu ProIle Arg Ile Ser Val Lys Gln 210 215 220Leu Glu Lys Phe Arg Met LeuLeu Phe Thr Gly Glu Glu Asp Gln Arg225 230 235 240Ile Gln Met AlaAsn Asn Phe Arg Pro Pro Gln Pro Leu Lys Gly Arg 245 250 255Ile ValArg Ala Ser Phe Lys Ala 26054262PRTHomo sapiens 54Met Ser Arg LeuSer Trp Gly Tyr Arg Glu His Asn Gly Pro Ile His1 5 10 15Trp Lys GluPhe Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30Glu IleLys Thr Lys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45SerIle Lys Tyr Asp Pro Ser Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 5560His Ser 
Phe Asn Val Asp Phe Asp Asp Thr Glu Asn Lys Ser Val Leu6570 75 80Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Val HisLeu 85 90 95His Trp Gly Ser Ala Asp Asp His Gly Ser Glu His Ile ValAsp Gly 100 105 110Val Ser Tyr Ala Ala Glu Leu His Val Val His TrpAsn Ser Asp Lys 115 120 125Tyr Pro Ser Phe Val Glu Ala Ala His GluPro Asp Gly Leu Ala Val 130 135 140Leu Gly Val Phe Leu Gln Ile GlyGlu Pro Asn Ser Gln Leu Gln Lys145 150 155 160Ile Thr Asp Thr LeuAsp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175Phe Thr AsnPhe Asp Leu Leu Ser Leu Leu Pro Pro Ser Trp Asp Tyr 180 185 190TrpThr Tyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200205Thr Trp Ile Val Leu Lys Gln Pro Ile Asn Ile Ser Ser Gln Gln Leu210 215 220Ala Lys Phe Arg Ser Leu Leu Cys Thr Ala Glu Gly Glu AlaAla Ala225 230 235 240Phe Leu Val Ser Asn His Arg Pro Pro Gln ProLeu Lys Gly Arg Lys 245 250 255Val Arg Ala Ser Phe His26055262PRTPan troglodytes 55Met Ser Arg Leu Ser Trp Gly Tyr ArgGlu His Asn Gly Pro Ile His1 5 10 15Trp Lys Glu Phe Phe Pro Ile AlaAsp Gly Asp Gln Gln Ser Pro Ile 20 25 30Glu Ile Lys Thr Lys Glu ValLys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45Ser Ile Lys Tyr Asp ProSer Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60His Ser Phe Asn ValAsp Phe Asp Asp Thr Glu Asn Lys Ser Val Leu65 70 75 80Arg Gly GlyPro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu 85 90 95His TrpGly Ser Ala Asp Asp His Gly Ser Glu His Ile Val Asp Gly 100 105110Val Ser Tyr Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys115 120 125Tyr Pro Ser Phe Val Glu Ala Ala His Glu Pro Asp Gly LeuAla Val 130 135 140Leu Gly Val Phe Leu Gln Ile Gly Glu Pro Asn SerGln Leu Gln Lys145 150 155 160Ile Thr Asp Thr Leu Asp Ser Ile LysGlu Lys Gly Lys Gln Thr Arg 165 170 175Phe Thr Asn Phe Asp Pro LeuSer Leu Leu Pro Pro Ser Trp Asp Tyr 180 185 190Trp Thr Tyr Pro GlySer Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200 205Thr Trp IleVal Leu Lys Gln Pro Ile Asn Ile Ser Ser Gln Gln Leu 210 215 220AlaLys Phe Arg Ser Leu Leu Cys Thr Ala Glu 
Gly Glu Ala Ala Ala225 230235 240Phe Leu Val Ser Asn His Arg Pro Pro Gln Pro Leu Lys Gly ArgLys 245 250 255Val Arg Ala Ser Phe His 26056262PRTMacaca mulatta56Met Ser Arg Leu Ser Trp Gly Tyr Arg Glu His Asn Gly Pro Ile His15 10 15Trp Lys Glu Phe Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser ProIle 20 25 30Glu Ile Lys Thr Gln Glu Val Lys Tyr Asp Ser Ser Leu ArgPro Leu 35 40 45Ser Ile Lys Tyr Asp Pro Ser Ser Ala Lys Ile Ile SerAsn Ser Gly 50 55 60His Ser Phe Asn Val Asp Phe Asp Asp Thr Glu AspLys Ser Val Leu65 70 75 80Arg Gly Gly Pro Leu Ala Gly Ser Tyr ArgLeu Arg Gln Phe His Leu 85 90 95His Trp Gly Ser Ala Asp Asp His GlySer Glu His Ile Val Asp Gly 100 105 110Val Ser Tyr Ala Ala Glu LeuHis Val Val His Trp Asn Ser Asp Lys 115 120 125Tyr Pro Ser Phe ValGlu Ala Ala His Glu Pro Asp Gly Leu Ala Val 130 135 140Leu Gly ValPhe Leu Gln Ile Gly Glu Pro Asn Ser Gln Leu Gln Lys145 150 155160Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg165 170 175Phe Thr Asn Phe Asp Pro Leu Ser Leu Leu Pro Pro Ser TrpAsp Tyr 180 185 190Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro LeuLeu Glu Ser Val 195 200 205Ile Trp Ile Val Leu Lys Gln Pro Ile AsnVal Ser Ser Gln Gln Leu 210 215 220Ala Lys Phe Arg Ser Leu Leu CysThr Ala Glu Gly Glu Ala Ala Ala225 230 235 240Phe Leu Leu Ser AsnHis Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys 245 250 255Val Arg AlaSer Phe Arg 26057262PRTOryctolagus cuniculus 57Met Ser Arg Ile SerTrp Gly Tyr Gly Glu His Asn Gly Pro Ile His1 5
10 15Trp Asn Gln Phe Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser ProIle 20 25 30Glu Ile Lys Thr Lys Glu Val Lys Tyr Asp Ser Ser Leu ArgPro Leu 35 40 45Ser Ile Lys Tyr Asp Pro Ser Ser Ala Lys Ile Ile SerAsn Ser Gly 50 55 60His Ser Phe Asn Val Asp Phe Asp Asp Thr Glu AspLys Ser Val Leu65 70 75 80Arg Gly Gly Pro Leu Thr Gly Asn Tyr ArgLeu Arg Gln Phe His Leu 85 90 95His Trp Gly Ser Ala Asp Asp His GlySer Glu His Val Val Asp Gly 100 105 110Val Arg Tyr Ala Ala Glu LeuHis Val Val His Trp Asn Ser Asp Lys 115 120 125Tyr Pro Ser Phe ValGlu Ala Ala His Glu Pro Asp Gly Leu Ala Val 130 135 140Leu Gly ValPhe Leu Gln Ile Gly Glu Tyr Asn Ser Gln Leu Gln Lys145 150 155160Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg165 170 175Phe Thr Asn Phe Asp Pro Leu Ser Leu Leu Pro Ser Ser TrpAsp Tyr 180 185 190Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro LeuLeu Glu Ser Val 195 200 205Thr Trp Ile Val Leu Lys Gln Pro Ile AsnIle Ser Ser Gln Gln Leu 210 215 220Ala Lys Phe Arg Ser Leu Leu CysSer Ala Glu Gly Glu Ser Ala Ala225 230 235 240Phe Leu Leu Ser AsnHis Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys 245 250 255Val Arg AlaSer Phe His 26058262PRTAiluropoda melanoleuca 58Met Ser Arg Leu SerTrp Gly Tyr Gly Glu His Asn Gly Pro Ile His1 5 10 15Trp Asn Lys PhePhe Pro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30Glu Ile LysThr Lys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45Ser IleLys Tyr Asp Ala Asn Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60HisSer Phe Ser Val Asp Phe Asp Asp Thr Glu Asp Lys Ser Val Leu65 70 7580Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu85 90 95His Trp Gly Ser Ala Asp Asp His Gly Ser Glu His Val Val AspGly 100 105 110Val Arg Tyr Ala Ala Glu Leu His Val Val His Trp AsnSer Asp Lys 115 120 125Tyr Pro Ser Phe Val Glu Ala Ala His Glu ProAsp Gly Leu Ala Val 130 135 140Leu Gly Val Phe Leu Gln Ile Gly GluHis Asn Ser Gln Leu Gln Lys145 150 155 160Ile Thr Asp Ile Leu AspSer Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175Phe Thr Asn PheAsp Pro Leu Ser 
Leu Leu Pro Pro Ser Trp Asp Tyr 180 185 190Trp ThrTyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200205Thr Trp Ile Val Leu Lys Gln Pro Ile Asn Ile Ser Ser Glu Gln Leu210 215 220Ala Thr Phe Arg Thr Leu Leu Cys Thr Ala Glu Gly Glu AlaAla Ala225 230 235 240Phe Leu Leu Ser Asn His Arg Pro Pro Gln ProLeu Lys Gly Arg Lys 245 250 255Val Arg Ala Ser Phe His26059262PRTSus scrofa 59Met Ser Arg Phe Ser Trp Gly Tyr Gly Glu HisAsn Gly Pro Val His1 5 10 15Trp Asn Glu Phe Phe Pro Ile Ala Asp GlyAsp Gln Gln Ser Pro Ile 20 25 30Glu Ile Lys Thr Lys Glu Val Lys TyrAsp Ser Ser Leu Arg Pro Leu 35 40 45Ser Ile Lys Tyr Asp Pro Ser SerAla Lys Ile Ile Ser Asn Ser Gly 50 55 60His Ser Phe Ser Val Asp PheAsp Asp Thr Glu Asp Lys Ser Val Leu65 70 75 80Arg Gly Gly Pro LeuThr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu 85 90 95His Trp Gly SerAla Asp Asp His Gly Ser Glu His Val Val Asp Gly 100 105 110Val LysTyr Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys 115 120125Tyr Pro Ser Phe Val Glu Ala Ala His Glu Pro Asp Gly Leu Ala Val130 135 140Leu Gly Val Phe Leu Gln Ile Gly Glu His Asn Ser Gln LeuGln Lys145 150 155 160Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu LysGly Lys Gln Thr Arg 165 170 175Phe Thr Asn Phe Asp Pro Leu Ser LeuLeu Pro Pro Ser Trp Asp Tyr 180 185 190Trp Thr Tyr Pro Gly Ser LeuThr Val Pro Pro Leu Leu Glu Ser Val 195 200 205Thr Trp Ile Ile LeuLys Gln Pro Ile Asn Ile Ser Ser Gln Gln Leu 210 215 220Ala Thr PheArg Thr Leu Leu Cys Thr Lys Glu Gly Glu Glu Ala Ala225 230 235240Phe Leu Leu Ser Asn His Arg Pro Leu Gln Pro Leu Lys Gly Arg Lys245 250 255Val Arg Ala Ser Phe His 26060262PRTCallithrix jacchus60Met Ser Arg Leu Ser Trp Gly Tyr Gly Glu His Asn Gly Pro Ile His15 10 15Trp Asn Glu Phe Phe Pro Ile Ala Asp Gly Asp Arg Gln Ser ProIle 20 25 30Glu Ile Lys Ala Lys Glu Val Lys Tyr Asp Ser Ser Leu ArgPro Leu 35 40 45Ser Ile Lys Tyr Asp Pro Ser Ser Ala Lys Ile Ile SerAsn Ser Gly 50 55 60His Ser Phe Asn Val Asp Phe Asp Asp Thr Glu AspLys Ser Val Leu65 70 75 80His Gly Gly Pro Leu Thr Gly 
Ser Tyr ArgLeu Arg Gln Phe His Leu 85 90 95His Trp Gly Ser Ala Asp Asp His GlySer Glu His Val Val Asp Gly 100 105 110Val Arg Tyr Ala Ala Glu LeuHis Val Val His Trp Asn Ser Glu Lys 115 120 125Tyr Pro Ser Phe ValGlu Ala Ala His Glu Pro Asp Gly Leu Ala Val 130 135 140Leu Gly ValPhe Leu Gln Ile Gly Glu Pro Asn Ser Gln Leu Gln Lys145 150 155160Ile Ile Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Ile Arg165 170 175Phe Thr Asn Phe Asp Pro Leu Ser Leu Phe Pro Pro Ser TrpAsp Tyr 180 185 190Trp Thr Tyr Ser Gly Ser Leu Thr Val Pro Pro LeuLeu Glu Ser Val 195 200 205Thr Trp Ile Leu Leu Lys Gln Pro Ile AsnIle Ser Ser Gln Gln Leu 210 215 220Ala Lys Phe Arg Ser Leu Leu CysThr Ala Glu Gly Glu Ala Ala Ala225 230 235 240Phe Leu Leu Ser AsnTyr Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys 245 250 255Val Arg AlaSer Phe Arg 26061262PRTRattus norvegicus 61Met Ala Arg Leu Ser TrpGly Tyr Asp Glu His Asn Gly Pro Ile His1 5 10 15Trp Asn Glu Leu PhePro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30Glu Ile Lys ThrLys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45Ser Ile LysTyr Asp Pro Ala Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60His SerPhe Asn Val Asp Phe Asp Asp Thr Glu Asp Lys Ser Val Leu65 70 7580Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu85 90 95His Trp Gly Ser Ala Asp Asp His Gly Ser Glu His Val Val AspGly 100 105 110Val Arg Tyr Ala Ala Glu Leu His Val Val His Trp AsnSer Asp Lys 115 120 125Tyr Pro Ser Phe Val Glu Ala Ala His Glu SerAsp Gly Leu Ala Val 130 135 140Leu Gly Val Phe Leu Gln Ile Gly GluHis Asn Pro Gln Leu Gln Lys145 150 155 160Ile Thr Asp Ile Leu AspSer Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175Phe Thr Asn PheAsp Pro Leu Cys Leu Leu Pro Ser Ser Trp Asp Tyr 180 185 190Trp ThrTyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200205Thr Trp Ile Val Leu Lys Gln Pro Ile Ser Ile Ser Ser Gln Gln Leu210 215 220Ala Arg Phe Arg Ser Leu Leu Cys Thr Ala Glu Gly Glu SerAla Ala225 230 235 240Phe Leu Leu Ser Asn His Arg Pro Pro Gln ProLeu Lys Gly Arg 
Arg 245 250 255 Val Arg Ala Ser Phe Tyr 260
62 262 PRT Mus musculus
62 Met Ala Arg Leu Ser Trp Gly Tyr Gly Glu His Asn Gly Pro Ile His 1 5 10 15 Trp Asn Glu Leu Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30 Glu Ile Lys Thr Lys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45 Ser Ile Lys Tyr Asp Pro Ala Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60 His Ser Phe Asn Val Asp Phe Asp Asp Thr Glu Asp Lys Ser Val Leu 65 70 75 80 Arg Gly Gly Pro Leu Thr Gly Asn Tyr Arg Leu Arg Gln Phe His Leu 85 90 95 His Trp Gly Ser Ala Asp Asp His Gly Ser Glu His Val Val Asp Gly 100 105 110 Val Arg Tyr Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys 115 120 125 Tyr Pro Ser Phe Val Glu Ala Ala His Glu Ser Asp Gly Leu Ala Val 130 135 140 Leu Gly Val Phe Leu Gln Ile Gly Glu His Asn Pro Gln Leu Gln Lys 145 150 155 160 Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175 Phe Thr Asn Phe Asp Pro Leu Cys Leu Leu Pro Ser Ser Trp Asp Tyr 180 185 190 Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200 205 Thr Trp Ile Val Leu Lys Gln Pro Ile Ser Ile Ser Ser Gln Gln Leu 210 215 220 Ala Arg Phe Arg Ser Leu Leu Cys Thr Ala Glu Gly Glu Ser Ala Ala 225 230 235 240 Phe Leu Leu Ser Asn His Arg Pro Pro Gln Pro Leu Lys Gly Arg Arg 245 250 255 Val Arg Ala Ser Phe Tyr 260
63 262 PRT Canis familiaris
63 Met Ser Arg Leu Ser Trp Gly Tyr Gly Glu His Asn Gly Pro Ile His 1 5 10 15 Trp Asn Lys Phe Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30 Glu Ile Lys Thr Lys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45 Ser Ile Lys Tyr Asp Ala Asn Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60 His Ser Phe Ser Val Asp Phe Asp Asp Thr Glu Asp Lys Ser Val Leu 65 70 75 80 Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu 85 90 95 His Trp Gly Ser Ala Asp Asp His Gly Ser Glu His Val Val Asp Gly 100 105 110 Val Arg Tyr Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys 115 120 125 Tyr Pro Ser Phe Val Glu Ala Ala His Glu Pro Asp Gly Leu Ala Val 130 135 140 Leu Gly Val Phe Leu Gln Ile Gly Glu His Asn Ser Gln Leu Gln
Lys 145 150 155 160 Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175 Phe Thr Asn Phe Asp Pro Leu Ser Leu Leu Pro Pro Ser Trp Asp Tyr 180 185 190 Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200 205 Thr Trp Ile Val Leu Lys Gln Pro Ile Asn Ile Ser Ser Gln Gln Leu 210 215 220 Ala Thr Phe Arg Thr Leu Leu Cys Thr Ala Glu Gly Glu Ala Ala Ala 225 230 235 240 Phe Leu Leu Ser Asn His Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys 245 250 255 Val Arg Ala Ser Phe His 260
64 252 PRT Equus caballus
64 Met Ser Gly Pro Val His Trp Asn Glu Phe Phe Pro Ile Ala Asp Gly 1 5 10 15 Asp Gln Gln Ser Pro Ile Glu Ile Lys Thr Lys Glu Val Lys Tyr Asp 20 25 30 Ser Ser Leu Arg Pro Leu Thr Ile Lys Tyr Asp Pro Ser Ser Ala Lys 35 40 45 Ile Ile Ser Asn Ser Gly His Ser Phe Ser Val Gly Phe Asp Asp Thr 50 55 60 Glu Asn Lys Ser Val Leu Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg 65 70 75 80 Leu Arg Gln Phe His Leu His Trp Gly Ser Ala Asp Asp His Gly Ser 85 90 95 Glu His Val Val Asp Gly Val Arg Tyr Ala Ala Glu Leu His Ile Val 100 105 110 His Trp Asn Ser Asp Lys Tyr Pro Ser Phe Val Glu Ala Ala His Glu 115 120 125 Pro Asp Gly Leu Ala Val Leu Gly Val Phe Leu Gln Val Gly Glu His 130 135 140 Asn Ser Gln Leu Gln Lys Ile Thr Asp Thr Leu Asp Ser Ile Lys Glu 145 150 155 160 Lys Gly Lys Gln Thr Leu Phe Thr Asn Phe Asp Pro Leu Ser Leu Leu 165 170 175 Pro Pro Ser Trp Asp Tyr Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro 180 185 190 Pro Leu Leu Glu Ser Val Thr Trp Ile Ile Leu Lys Gln Pro Ile Asn 195 200 205 Ile Ser Ser Gln Gln Leu Val Lys Phe Arg Thr Leu Leu Cys Thr Ala 210 215 220 Glu Gly Glu Thr Ala Ala Phe Leu Leu Ser Asn His Arg Pro Pro Gln 225 230 235 240 Pro Leu Lys Gly Arg Lys Val Arg Ala Ser Phe Arg 245 250
65 262 PRT Bos taurus
65 Met Ser Gly Phe Ser Trp Gly Tyr Gly Glu Arg Asp Gly Pro Val His 1 5 10 15 Trp Asn Glu Phe Phe Pro Ile Ala Asp Gly Asp Gln Gln Ser Pro Ile 20 25 30 Glu Ile Lys Thr Lys Glu Val Arg Tyr Asp Ser Ser Leu Arg Pro Leu 35 40 45 Gly Ile Lys Tyr Asp Ala Ser Ser Ala Lys Ile Ile Ser Asn Ser Gly 50 55 60 His Ser Phe Asn Val Asp Phe
Asp Asp Thr Asp Asp Lys Ser Val Leu 65 70 75 80 Arg Gly Gly Pro Leu Thr Gly Ser Tyr Arg Leu Arg Gln Phe His Leu 85 90 95 His Trp Gly Ser Thr Asp Asp His Gly Ser Glu His Val Val Asp Gly 100 105 110 Val Arg Tyr Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys 115 120 125 Tyr Pro Ser Phe Val Glu Ala Ala His Glu Pro Asp Gly Leu Ala Val 130 135 140 Leu Gly Ile Phe Leu Gln Ile Gly Glu His Asn Pro Gln Leu Gln Lys 145 150 155 160 Ile Thr Asp Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Thr Arg 165 170 175 Phe Thr Asn Phe Asp Pro Val Cys Leu Leu Pro Pro Cys Arg Asp Tyr 180 185 190 Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val 195 200 205 Thr Trp Ile Ile Leu Lys Gln Pro Ile Asn Ile Ser Ser Gln Gln Leu 210 215 220 Ala Ala Phe Arg Thr Leu Leu Cys Ser Arg Glu Gly Glu Thr Ala Ala 225 230 235 240 Phe Leu Leu Ser Asn His Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys 245 250 255 Val Arg Ala Ser Phe Arg 260
66 419 PRT Monodelphis domestica
66 Met Ala Ser Val Phe Ala Gly Trp Gly Pro Gly Arg Thr His Leu Phe 1 5 10 15 Phe Arg Phe Phe Pro Gly Pro Phe Ser Ala Leu Pro Ala Gln Thr Ser 20 25 30 Arg Gly Val Leu Val Phe Thr Ala Pro Gly Pro Ser Pro Arg Arg Val 35 40 45 Pro Asp Pro Val His Pro Gly Arg Asp Val Val Arg Pro Ser Gly Ser 50 55 60 Leu Phe Ser Cys Arg Leu Pro Pro Pro Arg Pro Ser Ala Pro Ala Arg 65 70 75 80 Glu Arg Arg Pro Leu Ala Glu Lys Val Gly Arg Ser Ser Ala Pro His 85 90 95 Leu Pro Leu Asp Asn Phe Glu Phe Ile Ala Lys Arg Leu Arg Arg Arg 100
105 110 Val Leu Ser Gly Leu Ala Ala Glu Ser Ala Gly Ala Leu Ala Pro Ser 115 120 125 Leu Pro Arg Ser Leu His Ser Ser Leu Gly Leu Arg Ser Ser Leu Lys 130 135 140 Ser Gln Arg Val Phe Pro Ser Pro His Ser Glu Glu Thr Met Ser Arg 145 150 155 160 Leu Ser Trp Gly Tyr Cys Glu His Asn Gly Pro Val His Trp Ser Glu 165 170 175 Leu Phe Pro Ile Ala Asp Gly Asp Tyr Gln Ser Pro Ile Glu Ile Asn 180 185 190 Thr Lys Glu Val Lys Tyr Asp Ser Ser Leu Arg Pro Leu Ser Ile Lys 195 200 205 Tyr Asp Pro Ala Ser Ala Lys Ile Ile Ser Asn Ser Gly His Ser Phe 210 215 220 Ser Val Asp Phe Asp Asp Ser Glu Asp Lys Ser Val Leu Arg Gly Gly 225 230 235 240 Pro Leu Ile Gly Thr Tyr Arg Leu Arg Gln Phe His Leu His Trp Gly 245 250 255 Ser Thr Asp Asp Gln Gly Ser Glu His Thr Val Asp Gly Met Lys Tyr 260 265 270 Ala Ala Glu Leu His Val Val His Trp Asn Ser Asp Lys Tyr Pro Ser 275 280 285 Phe Val Glu Ala Ala His Glu Pro Asp Gly Leu Ala Val Leu Gly Ile 290 295 300 Phe Leu Gln Thr Gly Glu His Asn Leu Gln Met Gln Lys Ile Thr Asp 305 310 315 320 Ile Leu Asp Ser Ile Lys Glu Lys Gly Lys Gln Ile Arg Phe Thr Asn 325 330 335 Phe Asp Pro Ala Thr Leu Leu Pro Gln Ser Trp Asp Tyr Trp Thr Tyr 340 345 350 Pro Gly Ser Leu Thr Val Pro Pro Leu Leu Glu Ser Val Thr Trp Ile 355 360 365 Val Leu Lys Gln Pro Ile Thr Ile Ser Ser Gln Gln Leu Ala Lys Phe 370 375 380 Arg Ser Leu Leu Tyr Thr Gly Glu Gly Glu Ala Ala Ala Phe Leu Leu 385 390 395 400 Ser Asn Tyr Arg Pro Pro Gln Pro Leu Lys Gly Arg Lys Val Arg Ala 405 410 415 Ser Phe Arg
67 428 PRT Ornithorhynchus anatinus
67 Met Lys Lys Gly Val Gly Ser Phe Tyr Glu Leu Ala Val Asn Arg Trp 1 5 10 15 Ser Val Val Asn Arg Val Gln Ile Met Ile Val Glu Ser Ile Thr Glu 20 25 30 Pro Leu Leu Cys Gly Ser Ala Leu Ala Val Ala Pro Ala Leu Ala Leu 35 40 45 Ala Val Val Gln Ala Leu Ala Leu Thr Val Val Gln Ala Leu Ala Leu 50 55 60 Ala Val Ser Pro Ala Leu Ala Leu Ser Val Ala Pro Ala Leu Ala Leu 65 70 75 80 Ala Val Val Gln Ala Leu Ala Leu Ala Val Val Gln Ala Leu Ala Leu 85 90 95 Ala Val Ala Gln Ala Leu Ala Leu Ala Val Ala Gln Ala Leu Ala Leu 100 105 110 Ala Val Ala Gln Ala Leu Ala
Leu Ala Leu Pro Gln Ala Leu Ala Leu 115 120 125 Thr Leu Pro Gln Ala Leu Ala Leu Thr Leu Ser Pro Thr Leu Ala Leu 130 135 140 Ser Val Ala Pro Ala Leu Ala Leu Ala Val Ala Pro Ala Leu Ala Leu 145 150 155 160 Ala Asp Ser Pro Ala Leu Ala Leu Ala Leu Ala Arg Pro His Pro Ser 165 170 175 Ser Gly Pro Ile His Trp Asn Glu Leu Phe Pro Ile Ala Asp Gly Asp 180 185 190 Arg Gln Ser Pro Ile Glu Ile Lys Thr Lys Glu Val Lys Tyr Asp Ser 195 200 205 Ser Leu Arg Pro Leu Ser Ile Lys Tyr Asp Pro Thr Ser Ala Lys Ile 210 215 220 Ile Ser Asn Ser Gly His Ser Phe Ser Val Asp Phe Asp Asp Thr Glu 225 230 235 240 Asp Lys Ser Val Leu Arg Gly Gly Pro Leu Ser Gly Thr Tyr Arg Leu 245 250 255 Arg Gln Phe His Phe His Trp Gly Ser Ala Asp Asp His Gly Ser Glu 260 265 270 His Thr Val Asp Gly Met Glu Tyr Ser Ala Glu Leu His Val Val His 275 280 285 Trp Asn Ser Asp Lys Tyr Ser Ser Phe Val Glu Ala Ala His Glu Pro 290 295 300 Asp Gly Leu Ala Val Leu Gly Ile Phe Leu Lys Arg Gly Glu His Asn 305 310 315 320 Leu Gln Leu Gln Lys Ile Thr Asp Ile Leu Asp Ala Ile Lys Glu Lys 325 330 335 Gly Lys Gln Met Arg Phe Thr Asn Phe Asp Pro Leu Ser Leu Leu Pro 340 345 350 Leu Thr Arg Asp Tyr Trp Thr Tyr Pro Gly Ser Leu Thr Val Pro Pro 355 360 365 Leu Leu Glu Ser Val Ile Trp Ile Ile Phe Lys Gln Pro Ile Ser Ile 370 375 380 Ser Ser Gln Gln Leu Ala Lys Phe Arg Asn Leu Leu Tyr Thr Ala Glu 385 390 395 400 Gly Glu Ala Ala Asp Phe Met Leu Ser Asn His Arg Pro Pro Gln Pro 405 410 415 Leu Lys Gly Arg Lys Val Arg Ala Ser Phe Arg Ser 420 425
68 1082 PRT Chlamydomonas reinhardtii
68 Met Leu Pro Gly Leu Gly Val Ile Leu Leu Val Leu Pro Met Gln Tyr 1 5 10 15 Tyr Phe Gly Tyr Lys Ile Val Gln Ile Lys Leu Gln Asn Ala Lys His 20 25 30 Val Ala Leu Arg Ser Ala Ile Met Gln Glu Val Leu Pro Ala Ile Lys 35 40 45 Leu Val Lys Tyr Tyr Ala Trp Glu Gln Phe Phe Glu Asn Gln Ile Ser 50 55 60 Lys Val Arg Arg Glu Glu Ile Arg Leu Asn Phe Trp Asn Cys Val Met 65 70 75 80 Lys Val Ile Asn Val Ala Cys Val Phe Cys Val Pro Pro Met Thr Ala 85 90 95 Phe Val Ile Phe Thr Thr Tyr Glu Phe Gln Arg Ala Arg Leu Val Ser 100 105 110 Ser Val Ala Phe Thr
Thr Leu Ser Leu Phe Asn Ile Leu Arg Phe Pro 115 120 125 Leu Val Val Leu Pro Lys Ala Leu Arg Ala Val Ser Glu Ala Asn Ala 130 135 140 Ser Leu Gln Arg Leu Glu Ala Tyr Leu Leu Glu Glu Val Pro Ser Gly 145 150 155 160 Thr Ala Ala Val Lys Thr Pro Lys Asn Ala Pro Pro Gly Ala Val Ile 165 170 175 Glu Asn Gly Val Phe His His Pro Ser Asn Pro Asn Trp His Leu His 180 185 190 Val Pro Lys Phe Glu Val Lys Pro Gly Gln Val Val Ala Val Val Gly 195 200 205 Arg Ile Ala Ala Gly Lys Ser Ser Leu Val Gln Ala Ile Leu Gly Asn 210 215 220 Met Val Lys Glu His Gly Ser Phe Asn Val Gly Gly Arg Ile Ser Tyr 225 230 235 240 Val Pro Gln Asn Pro Trp Leu Gln Asn Leu Ser Leu Arg Asp Asn Val 245 250 255 Leu Phe Gly Glu Gln Phe Asp Glu Asn Lys Tyr Thr Asp Val Ile Glu 260 265 270 Ser Cys Ala Leu Thr Leu Asp Leu Gln Ile Leu Ser Asn Gly Asp Gln 275 280 285 Ser Lys Ala Gly Ile Arg Gly Val Asn Phe Ser Gly Gly Gln Arg Gln 290 295 300 Arg Val Asn Leu Ala Arg Cys Ala Tyr Ala Asp Ala Asp Leu Val Leu 305 310 315 320 Leu Asp Asn Ala Leu Ser Ala Val Asp His His Thr Ala His His Ile 325 330 335 Phe Asp Lys Cys Ile Lys Gly Leu Phe Ser Asp Lys Ala Val Val Leu 340 345 350 Val Thr His Gln Ile Glu Phe Met Pro Arg Cys Asp Asn Val Ala Ile 355 360 365 Met Asp Glu Gly Arg Cys Leu Tyr Phe Gly Lys Trp Asn Glu Glu Ala 370 375 380 Gln His Leu Leu Gly Lys Leu Leu Pro Ile Thr His Leu Leu His Ala 385 390 395 400 Ala Gly Ser Gln Glu Ala Pro Pro Ala Pro Lys Lys Lys Ala Glu Asp 405 410 415 Lys Ala Gly Pro Gln Lys Ser Gln Ser Leu Gln Leu Thr Leu Ala Pro 420 425 430 Thr Ser Ile Gly Lys Pro Thr Glu Lys Pro Lys Asp Val Gln Lys Leu 435 440 445 Thr Ala Tyr Gln Ala Ala Leu Ile Tyr Thr Trp Tyr Gly Asn Leu Phe 450 455 460 Leu Val Gly Val Cys Phe Phe Phe Phe Leu Ala Ala Gln Cys Ser Arg 465 470 475 480 Gln Ile Ser Asp Phe Trp Val Arg Trp Trp Val Asn Asp Glu Tyr Lys 485 490 495 Lys Phe Pro Val Lys Gly Glu Gln Asp Ser Ala Ala Thr Thr Phe Tyr 500 505 510 Cys Leu Ile Tyr Leu Leu Leu Val Gly Leu Phe Tyr Ile Phe Met Ile 515 520 525 Phe Arg Gly Ala Thr Phe Leu Trp Trp Val Leu Lys Ser Ser Glu Thr 530 535 540 Ile Arg
Arg Lys Ala Leu His Asn Val Leu Asn Ala Pro Met Gly Phe 545 550 555 560 Phe Leu Val Thr Pro Val Gly Asp Leu Leu Leu Asn Phe Thr Lys Asp 565 570 575 Gln Asp Ile Met Asp Glu Asn Leu Pro Asp Ala Val His Phe Met Gly 580 585 590 Ile Tyr Gly Leu Ile Leu Leu Ala Thr Thr Ile Thr Val Ser Val Thr 595 600 605 Ile Asn Phe Phe Ala Ala Phe Thr Gly Ala Leu Ile Ile Met Thr Leu 610 615 620 Ile Met Leu Ser Ile Tyr Leu Pro Ala Ala Thr Ala Leu Lys Lys Ala 625 630 635 640 Arg Ala Val Ser Gly Gly Met Leu Val Gly Leu Val Ala Glu Val Leu 645 650 655 Glu Gly Leu Gly Val Val Gln Ala Phe Asn Lys Gln Glu Tyr Phe Ile 660 665 670 Glu Glu Ala Ala Arg Arg Thr Asn Ile Thr Asn Ser Ala Val Phe Asn 675 680 685 Ala Glu Ala Leu Asn Leu Trp Leu Ala Phe Trp Cys Asp Phe Ile Gly 690 695 700 Ala Cys Leu Val Gly Val Val Ser Ala Phe Ala Val Gly Met Ala Lys 705 710 715 720 Asp Leu Gly Gly Ala Thr Val Gly Leu Ala Phe Ser Asn Ile Ile Gln 725 730 735 Met Leu Val Phe Tyr Thr Trp Val Val Arg Phe Ile Ser Glu Ser Ile 740 745 750 Ser Leu Phe Asn Ser Val Glu Gly Met Ala Tyr Leu Ala Asp Tyr Val 755 760 765 Pro His Asp Gly Val Phe Tyr Asp Gln Arg Gln Lys Asp Gly Val Ala 770 775 780 Lys Gln Ile Val Leu Pro Asp Gly Asn Ile Val Pro Ala Ala Ser Lys 785 790 795 800 Val Gln Val Val Val Asp Asp Ala Ala Leu Ala Arg Trp Pro Ala Thr 805 810 815 Gly Asn Ile Arg Phe Glu Asp Val Trp Met Gln Tyr Arg Leu Asp Ala 820 825 830 Pro Trp Ala Leu Lys Gly Val Thr Phe Lys Ile Asn Asp Gly Glu Lys 835 840 845 Val Gly Ala Val Gly Arg Thr Gly Ser Gly Lys Ser Thr Thr Leu Leu 850 855 860 Ala Leu Tyr Arg Met Phe Glu Leu Gly Lys Gly Arg Ile Leu Val Asp 865 870 875 880 Gly Val Asp Ile Ala Thr Leu Ser Leu Lys Arg Leu Arg Thr Gly Leu 885 890 895 Ser Ile Ile Pro Gln Glu Pro Val Met Phe Thr Gly Thr Val Arg Ser 900 905 910 Asn Leu Asp Pro Phe Gly Glu Phe Lys Asp Asp Ala Ile Leu Trp Glu 915 920 925 Val Leu Lys Lys Val Gly Leu Glu Asp Gln Ala Gln His Ala Gly Gly 930 935 940 Leu Asp Gly Gln Val Asp Gly Thr Gly Gly Lys Ala Trp Ser Leu Gly 945 950 955 960 Gln Met Gln Leu Val Cys Leu Ala Arg Ala Ala Leu Arg Ala Val Pro 965
970 975 Ile Leu Cys Leu Asp Glu Ala Thr Ala Ala Met Asp Pro His Thr Glu 980 985 990 Ala Ile Val Gln Gln Thr Ile Lys Lys Val Phe Asp Asp Arg Thr Thr 995 1000 1005 Ile Thr Ile Ala His Arg Leu Asp Thr Ile Ile Glu Ser Asp Lys 1010 1015 1020 Ile Ile Val Met Glu Gln Gly Ser Leu Met Glu Tyr Glu Ser Pro 1025 1030 1035 Ser Lys Leu Leu Ala Asn Arg Asp Ser Met Phe Ser Lys Leu Val 1040 1045 1050 Asp Lys Thr Gly Pro Ala Ala Ala Ala Ala Leu Arg Lys Met Ala 1055 1060 1065 Glu Asp Phe Trp Ser Thr Arg Ser Ala Gln Gly Arg Asn Gln 1070 1075 1080
69 1321 PRT Volvox carteri
69 Met Gly Thr Ile Ser His Pro Ala Arg Gly Asn Asp Pro Thr Ala Gly 1 5 10 15 Phe Phe Asn Lys Phe Ala Phe Gly Trp Met Phe Lys His Val Ser Glu 20 25 30 Ala Arg Lys Asn Gly Asp Ile Asp Leu Asp Lys Met Gly Met Pro Pro 35 40 45 Glu Asn His Ala His Glu Ala Tyr Asp Met Phe Ala Ser Asn Trp Ala 50 55 60 Ala Glu Met Lys Leu Lys Asp Ser Gly Ala Lys Pro Ser Leu Val Arg 65 70 75 80 Ala Leu Arg Lys Ser Phe Gly Leu Val Tyr Leu Leu Gly Gly Val Phe 85 90 95 Lys Cys Phe Trp Ser Thr Phe Val Ile Thr Gly Ala Phe Tyr Phe Val 100 105 110 Arg Ser Leu Leu Ala His Val Asn Gly Ile Lys Asp Gly Arg Leu Tyr 115 120 125 Ser Lys Thr Val Ser Gly Trp Cys Leu Met Ala Gly Phe Thr Leu Asp 130 135 140 Ala Trp Leu Leu Gly Leu Ser Leu Gln Arg Met Gly Tyr Ile Cys Met 145 150 155 160 Ser Val Gly Ile Arg Ala Arg Ala Ala Leu Val Gln Ala Val Thr His 165 170 175 Lys Ala Phe Arg Leu Ser Ser Val Arg Ala Asp Gln Ser Ala Ala Ile 180 185 190 Val Asn Phe Val Ser Ser Asp Ile Gln Lys Ile Tyr Asp Gly Ala Leu 195 200 205 Glu Phe His Tyr Leu Trp Thr Ala Pro Phe Glu Ala Ala Ala Ile Leu 210 215 220 Ala Leu Leu Gly Tyr Leu Thr Asn Asp Ser Met Leu Pro Gly Leu Gly 225 230 235 240 Val Ile Leu Leu Val Leu Pro Leu Gln Tyr Phe Phe Gly Tyr Lys Ile 245 250 255 Ile Gln Ile Lys Leu Gln Asn Ala Lys His Val Ala Leu Arg Ser Ser 260 265 270 Ile Leu Gln Glu Val Leu Pro Ala Ile Lys Leu Val Lys Tyr Tyr Ala 275 280 285 Trp Glu Gln Phe Phe Glu Asp Glu Ile Ser Lys Ile Arg Arg Glu Glu 290 295 300 Met Arg Leu Ser Phe Trp Asn Ala Met Met Lys Val Ile Asn Val
Ala 305 310 315 320 Cys Val Phe Cys Val Pro Pro Met Thr Ala Phe Val Ile Phe Thr Thr 325 330 335 Tyr Glu Phe Gln Lys Ala Arg Leu Val Ser Gly Val Ala Phe Thr Thr 340 345 350 Leu Ser Leu Phe Asn Ile Leu Arg Phe Pro Leu Val Val Leu Pro Lys 355 360 365 Ala Leu Arg Ala Val Ser Glu Ala His Ala Ser Leu Gln Arg Leu Glu 370 375 380 Ser Tyr Leu Leu Glu Asp Val Pro Gln Gly Thr Ala Ser Gly Gly Lys 385 390 395 400 Ser Ser Lys Ser Ser Ala Pro Gly Val His Ile Asp Asn Ala Val Tyr 405 410 415 His His Pro Ser Asn Pro Asn Trp His Leu His Val Pro Arg Phe Asp 420 425 430 Val Arg Pro Gly Gln Val Val Ala Val Val Gly Arg Ile Gly Ala Gly 435 440 445 Lys Ser Ser Leu Val Gln Ala Ile Leu Gly Asn Met Val Lys Glu His 450 455 460 Gly Ser Gln Gln Val Gly Gly Arg Ile Ser Tyr Val Pro Gln Asn Pro 465 470 475 480 Trp Leu Gln Asn Leu Ser Ile Arg Asp Asn Val Thr Phe Gly Glu Gly 485 490 495 Trp Asp Glu Asn Lys Tyr Glu Ala Val Ile Asp Ala Cys Ala Leu Thr 500 505 510 Met Asp Leu Gln Ile Leu Pro Gln Gly Asp Gln Ser Lys Ala Gly Ile 515 520 525 Arg Gly Val Asn Phe Ser Gly Gly Gln Arg Gln Arg Val Asn Leu Ala 530 535 540 Arg Cys Ala Tyr Ala Asp Ala Asp Leu Val Leu Leu Asp Asn Ala Leu 545 550 555 560 Ser Ala Val Asp His His Thr Ala His His Ile Phe Asp Lys Cys Ile 565 570 575 Lys Gly Leu Phe Ser Asp Lys Ala Val Val Leu Ile Thr His Gln Ile 580 585 590 Glu Phe Met Pro Arg Cys Asp Ala Val Ala Ile Met Asp Glu Gly Arg 595 600 605 Cys Leu Tyr Phe Gly Lys Trp Asn Glu Glu Ser Gln His Leu Leu Gly 610 615 620 Lys Leu Leu Pro Ile Thr His Leu Leu His Ala Ala Gly Ser Gln Glu 625 630
635 640 Ala Pro Pro Ala Ala Pro Lys Lys Lys Asp Asp Lys Ala Thr Pro Gln 645 650 655 Lys Ser Gln Ser Leu Gln Leu Thr Leu Ala Pro Thr Ser Ile Gly Lys 660 665 670 Pro Thr Gln Lys Asp Thr Lys Ala Ala Pro Lys Leu Thr Ala Phe Lys 675 680 685 Ala Ala Leu Ile Tyr Thr Tyr Tyr Gly Asn Ile Leu Leu Val Phe Val 690 695 700 Cys Phe Ile Thr Phe Leu Ala Ala Gln Thr Cys Arg Gln Met Ser Asp 705 710 715 720 Phe Trp Val Arg Trp Trp Val Asn Asp Glu Tyr Lys His Phe Pro Lys 725 730 735 Arg Thr Gly Val Arg Glu Glu Ser Ala Thr Lys Phe Tyr Ala Leu Ile 740 745 750 Tyr Leu Leu Leu Val Gly Leu Phe Tyr Phe Thr Met Val Ala Arg Gly 755 760 765 Ser Thr Phe Leu Trp Trp Val Leu Arg Ser Ser Glu Asn Ile Arg Lys 770 775 780 Lys Ala Leu Asn Asn Val Leu Asn Ala Pro Met Gly Phe Phe Leu Val 785 790 795 800 Thr Pro Val Gly Asp Leu Leu Leu Asn Phe Thr Lys Asp Gln Asp Ile 805 810 815 Met Asp Glu Asn Leu Pro Asp Ala Ile His Phe Met Gly Ile Tyr Gly 820 825 830 Leu Ile Leu Leu Ala Thr Thr Ile Thr Val Ser Val Thr Ile Asn Phe 835 840 845 Phe Gly Ala Phe Thr Gly Phe Leu Ile Ile Met Thr Leu Ile Met Leu 850 855 860 Ala Ile Tyr Leu Pro Ala Ala Thr Ala Leu Lys Lys Ala Arg Ala Val 865 870 875 880 Ser Gly Gly Gln Leu Val Gly Leu Val Ala Glu Val Leu Glu Gly Leu 885 890 895 Asn Val Val Gln Ala Phe Ser Lys Gln Glu Tyr Phe Ile Glu Glu Ala 900 905 910 Ala Arg Arg Thr Asp Val Thr Asn Ala Ala Val Phe Asn Ala Glu Ser 915 920 925 Leu Asn Leu Trp Leu Ala Phe Trp Cys Asp Leu Ile Gly Ala Ser Leu 930 935 940 Val Gly Val Val Ser Ala Phe Ala Val Gly Leu Lys Asp Gln Leu Gly 945 950 955 960 Ala Ala Thr Val Gly Leu Ala Phe Ser Asn Ile Ile Gln Met Leu Val 965 970 975 Phe Tyr Thr Trp Val Val Arg Phe Ile Ala Glu Ser Ile Ser Leu Phe 980 985 990 Asn Ser Val Glu Ala Met Ala Trp Leu Ala Asp Tyr Val Pro Lys Asp 995 1000 1005 Gly Ile Phe Tyr Asp Gln Lys Gln Leu Asp Gly Val Ala Lys Ser 1010 1015 1020 Ile Thr Leu Pro Asp Gly Gln Ile Val Pro Ala Thr Ser Lys Val 1025 1030 1035 Gln Val Val Val Asp Asp Ala Ala Leu Ala Arg Trp Pro Ala Thr 1040 1045 1050 Gly Asn Ile Arg Phe Glu Asp Val Trp Met Gln Tyr Arg Leu Asp
1055 1060 1065 Ala Ala Trp Ala Leu Lys Gly Val Thr Phe Lys Ile Asn Asp Gly 1070 1075 1080 Glu Lys Val Gly Ala Val Gly Arg Thr Gly Ser Gly Lys Ser Thr 1085 1090 1095 Thr Leu Leu Ala Leu Tyr Arg Met Phe Glu Leu Gly Lys Gly Arg 1100 1105 1110 Ile Leu Ile Asp Gly Val Asp Ile Ala Thr Leu Ser Leu Lys Arg 1115 1120 1125 Leu Arg Thr Gly Leu Ser Ile Ile Pro Gln Glu Pro Val Met Phe 1130 1135 1140 Thr Gly Thr Val Arg Ser Asn Leu Asp Pro Phe Gly Glu Phe Lys 1145 1150 1155 Asp Asp Ser Val Leu Trp Glu Val Leu Gln Lys Val Gly Leu Glu 1160 1165 1170 Ala Gln Ala Gln His Ala Gly Gly Leu Asp Gly Arg Val Asp Gly 1175 1180 1185 Thr Gly Gly Lys Ala Trp Ser Leu Gly Gln Met Gln Leu Val Cys 1190 1195 1200 Leu Ala Arg Ala Ala Leu Arg Ala Val Pro Ile Leu Cys Leu Asp 1205 1210 1215 Glu Ala Thr Ala Ala Met Asp Pro His Thr Glu Gln Val Val Gln 1220 1225 1230 Glu Thr Ile Lys Lys Val Phe Asp Asp Arg Thr Thr Ile Thr Ile 1235 1240 1245 Ala His Arg Leu Asp Thr Ile Ile Glu Ser Asp Lys Val Leu Val 1250 1255 1260 Met Glu Ala Gly Glu Leu Lys Glu Phe Ala Pro Pro Ala Gln Leu 1265 1270 1275 Leu Ala Asn Arg Glu Thr Met Phe Ser Lys Leu Val Asp Lys Thr 1280 1285 1290 Gly Pro Ala Ala Ala Ala Ala Leu Arg Lys Met Ala Asp Glu His 1295 1300 1305 Phe Ser Lys Ser Gln Ala Arg Ala Ala Ala Gln Arg His 1310 1315 1320
70 2297 PRT Chlorella variabilis
70 Met Val Pro Leu Leu Ala Gln Arg Gly Arg Ile Arg Ser Gln Ala Pro 1 5 10 15 Arg Thr Trp His Pro Asp Pro Gln Pro Leu His Ala Glu Arg Ser Arg 20 25 30 Gln Cys Pro Gly Arg Gly Val Arg Ala Ala Ala Lys Arg Gly Gly Gly 35 40 45 Ser Gly Gly Ala Thr His Lys Ser Lys Lys Ser Lys Glu Leu Asp Glu 50 55 60 Val Ala Ala Phe Glu Gln Leu Met Cys Asp Trp Asp Asp Ala Phe Ala 65 70 75 80 Ala Asp Cys Tyr Asp Asn Glu Arg Ala Ala Arg Met Ala Arg Leu Ala 85 90 95 Glu Glu Gly Tyr Gln His His Gly Arg Gly Phe Val Phe Val Arg Ser 100 105 110 Arg Leu Asp Lys Arg Ser Arg Lys Ala Arg Asn Asp Ser Gly Ala Ser 115 120 125 Lys Gly Phe Gly Ala Ala Ala Lys Ala Leu Ser Val Glu Gln Gly Thr 130 135 140 Pro Leu Glu Asn Asn Pro Gln Leu His Leu Leu Ser Trp Thr Ala Cys 145 150 155
160 Tyr Ile Ala Ser Ser Gln Leu Asp Ser Leu Gly Gly Leu Phe Ser Thr 165 170 175 Gln Glu Gly Val Leu Leu Pro Asp Ser Gly Ser Leu Leu Thr Asp Gly 180 185 190 Gly Ser Gly Ala Ser Gly Ser Asn Ala Ala Asp Ala Val Gly Glu Leu 195 200 205 Gln Arg Val Leu Arg Gly Gln Asp Leu Ser Gln Leu Arg Gly Tyr Val 210 215 220 Gly Ala Pro Pro Gln Ala Arg Pro Ala Ser Gly Ser Asp Asp Asp Gly 225 230 235 240 Ser Ser Thr Thr Gly Ser Asn Asn Gly Ala Ala Gly Glu Gly Ser Glu 245 250 255 Val Glu Glu Gly Thr Ala Met Gly Gly Ile Arg Arg Tyr Glu Pro Glu 260 265 270 Ser Gly Glu Leu Val Val Leu Leu Ser Cys Lys Ile Gly Gly Lys Pro 275 280 285 Ala Val Gly Ala Glu Leu Leu Ala Val Ala Gln Ala Glu Asp Gly Lys 290 295 300 His Ala Pro Gly Ala Ser Pro Asp Thr Arg Leu Cys Lys Glu Pro Ser 305 310 315 320 Gln Ser Ala Phe Asp Leu Trp Ser Phe Gly Trp Met Asn Lys Ile Val 325 330 335 Pro Ala Ala Arg Arg Gly Glu Val Glu Val Ala Asp Leu Pro Leu Pro 340 345 350 Glu Ala Gln Gln Ala Glu Pro Cys Tyr Glu Glu Leu Asn Thr Asn Trp 355 360 365 Glu Ala Ala Val Gln Glu Ala Lys Lys Ala Gly Lys Glu Pro Lys Leu 370 375 380 Met Lys Val Leu Trp Lys Thr Tyr Gly Lys Asp Ile Val Leu Ala Gly 385 390 395 400 Ile Phe Lys Leu Met Trp Ser Val Phe Val Ile Leu Gly Ala Tyr Tyr 405 410 415 Phe Thr Arg Ser Ile Leu Met Cys Ile Arg Thr Leu Glu Gly Lys Asp 420 425 430 Asp Ser Ile Tyr Asp Thr Glu Trp Lys Gly Trp Val Leu Thr Gly Phe 435 440 445 Phe Phe Leu Asp Ala Trp Leu Leu Gly Met Met Leu Gln Arg Met Ala 450 455 460 Phe Asn Cys Leu Lys Val Gly Ile Lys Ala Arg Ala Ala Leu Thr Thr 465 470 475 480 Met Ile Ala Arg Lys Cys Tyr Asn Met Ala His Leu Thr Lys Asp Thr 485 490 495 Ala Ala Glu Ala Val Gly Phe Val Ala Ser Asp Ile Asn Lys Val Phe 500 505 510 Glu Gly Ile Gln Glu Val His Tyr Leu Trp Gly Ala Pro Val Glu Ala 515 520 525 Gly Ala Ile Leu Ala Leu Leu Gly Thr Leu Val Gly Val Tyr Cys Ile 530 535 540 Gly Gly Val Ile Ile Val Cys Met Val Val Pro Leu Gln Tyr Tyr Phe 545 550 555 560 Gly Tyr Lys Ile Ile Lys Asn Lys Ile Lys Asn Ala Pro Asn Val Thr 565 570 575 Glu Arg Trp Ser Ile Ile Gln Glu Ile Leu Pro Ala Met Lys Leu
Val 580 585 590 Lys Tyr Tyr Ala Trp Glu Arg Phe Phe Glu Lys His Val Ala Asp Met 595 600 605 Arg Thr Arg Glu Arg His Tyr Met Phe Trp Asn Ala Val Val Lys Thr 610 615 620 Val Asn Val Thr Met Val Phe Gly Val Pro Pro Met Val Thr Phe Ala 625 630 635 640 Val Leu Val Pro Tyr Glu Leu Trp His Val Asp Ser Ser Thr Ser Glu 645 650 655 Pro Tyr Ile Lys Pro Gln Thr Ala Phe Thr Met Leu Ser Leu Phe Asn 660 665 670 Val Leu Arg Phe Pro Leu Val Val Leu Pro Lys Ala Met Arg Cys Val 675 680 685 Ser Glu Ala Leu Arg Ser Val Gly Asn Leu Glu Lys Phe Leu Ala Glu 690 695 700 Pro Val Ala Pro Arg Gln Asp Leu Glu Gly Lys Pro Gly Ala Gln Leu 705 710 715 720 Ser Lys Ala Val Leu Arg His Glu Met Asp Thr Ser Gly Phe Thr Leu 725 730 735 Arg Val Pro Glu Phe Ser Val Lys Ala Gly Glu Leu Val Ala Val Val 740 745 750 Gly Arg Val Gly Ala Gly Lys Ser Ser Ile Leu Gln Ala Met Leu Gly 755 760 765 Asn Met Gln Thr Ala Ser Gly Leu Ala Lys Cys Gln His Ser Ala Ser 770 775 780 Ser Cys Leu Pro Phe Leu Val Glu Gly Thr Ala His Ser Gly Gly Arg 785 790 795 800 Ile Ala Tyr Val Pro Gln Thr Ala Trp Cys Gln Asn Leu Ser Leu Arg 805 810 815 Asp Asn Ile Thr Phe Gly Gln Pro Trp Asp Glu Ala Lys Tyr Lys Gln 820 825 830 Val Ile His Ala Cys Ala Leu Glu Leu Asp Leu Ala Ile Leu Ala Ala 835 840 845 Gly Asp Gln Ser Lys Ala Gly Leu Arg Gly Ile Asn Leu Ser Gly Gly 850 855 860 Gln Arg Gln Arg Leu Asn Leu Ala Arg Cys Ala Tyr Phe Asp Gly Asp 865 870 875 880 Leu Val Leu Leu Asp Asn Ala Leu Ser Ala Val Asp His His Thr Ala 885 890 895 His His Ile Phe Glu His Cys Val Arg Gly Met Phe Arg Asp Lys Ala 900 905 910 Thr Val Leu Val Thr His Gln Val Glu Phe Leu Pro Gln Cys Asp Lys 915 920 925 Val Ala Ile Met Asp Asp Gly Thr Cys Val Tyr Phe Gly Pro Trp Asn 930 935 940 Ala Ala Ala Gln Gln Leu Leu Ser Lys Tyr Leu Pro Ala Ser His Leu 945 950 955 960 Leu Ala Ala Gly Gly Asn Ala Glu Gln Pro Arg Asp Thr Lys Lys Lys 965 970 975 Val Val Lys Lys Glu Glu Thr Lys Lys Thr Glu Asp Ala Gly Lys Ala 980 985 990 Lys Arg Val His Ser Ala Ser Leu Thr Leu Lys Ser Ala Leu Trp Glu 995 1000 1005 Tyr Cys Trp Asp Ala Arg Trp Ile Ile Phe Cys Leu
Ser Leu Phe 1010 1015 1020 Phe Phe Leu Thr Ala Gln Ala Ser Arg Gln Leu Ala Asp Tyr Phe 1025 1030 1035 Ile Arg Trp Trp Thr Arg Asp His Tyr Asn Lys Tyr Gly Val Leu 1040 1045 1050 Cys Ile Asp Glu Gly Asp Asn Pro Cys Gly Pro Leu Phe Tyr Val 1055 1060 1065 Gln Tyr Tyr Gly Ile Leu Gly Leu Leu Cys Phe Ile Val Leu Met 1070 1075 1080 Ala Phe Arg Gly Ala Phe Leu Tyr Thr Trp Ser Leu Gly Ala Ser 1085 1090 1095 Tyr Arg Gln His Glu Lys Ser Ile His Arg Val Leu Tyr Ala Pro 1100 1105 1110 Leu Gly Phe Phe Leu Thr Thr Pro Val Gly Asp Leu Leu Val Ser 1115 1120 1125 Phe Thr Lys Asp Gln Asp Val Met Asp Asp Ala Leu Pro Asp Ala 1130 1135 1140 Leu Tyr Tyr Ala Gly Ile Tyr Gly Leu Ile Leu Leu Ala Thr Ala 1145 1150 1155 Ile Thr Val Ser Val Thr Ile Pro Leu Phe Ser Ala Leu Ala Gly 1160 1165 1170 Gly Leu Phe Val Val Ser Gly Ile Met Leu Ala Ile Tyr Leu Pro 1175 1180 1185 Ala Ala Thr His Leu Lys Lys Leu Arg Met Gly Thr Ser Gly Asp 1190 1195 1200 Val Val Thr Leu Ile Ala Glu Ala Leu Asp Gly Leu Gly Val Ile 1205 1210 1215 Gln Ala Tyr Gly Lys Gln Ala Tyr Phe Thr Thr Ile Thr Ser Gln 1220 1225 1230 Tyr Val Asn Asp Ala His Arg Ala Leu Phe Gly Ala Glu Ser Leu 1235 1240 1245 Asn Leu Trp Leu Ala Phe Ile Cys Asp Phe Phe Gly Ala Cys Met 1250 1255 1260 Val Leu Ser Val Ala Cys Phe Gly Ile Gly Gln Trp Ser Thr Leu 1265 1270 1275 Gly Ser Ser Ser Val Gly Leu Ala Phe Ser Gln Ser Ile Gln Met 1280 1285 1290 Leu Val Phe Tyr Thr Trp Ser Ile Arg Leu Val Ala Glu Cys Ile 1295 1300 1305 Gly Leu Phe Gly Ser Ala Glu Lys Ile Ala Trp Leu Ala Asn His 1310 1315 1320 Thr Pro Gln Glu Ala Gly Ser Leu Asp Pro Pro Ser Leu Pro Gly 1325 1330 1335 Ser Gly Glu Thr Lys Ala Ala Pro Lys Lys Arg Gly Thr Ala Gly 1340 1345 1350 Lys Phe Leu Pro Pro Leu Lys Asp Glu Asp Leu Ala Ile Val Pro 1355 1360 1365 Thr Gly Gly Pro Lys Leu Pro Ser Gly Trp Pro Arg Thr Gly Val 1370 1375 1380 Leu Glu Phe Asn Gln Val Val Met Lys Tyr Ala Pro His Leu Pro 1385 1390 1395 Pro Ala Leu Arg Gly Val Ser Phe Lys Val Lys Ser Gly Asp Lys 1400 1405 1410 Val Gly Val Val Gly Arg Thr Gly Ser Gly Lys Ser Thr Leu Leu 1415 1420 1425 Leu
Ala Leu Tyr Arg Met Phe Asn Leu Glu Ser Gly Ala Ile Thr 1430 1435 1440 Leu Asp Gly Ile Asp Ile Ser Thr Leu Thr Leu Glu Gln Leu Arg 1445 1450 1455 Arg Gly Leu Ser Val Ile Pro Gln Glu Pro Thr Val Phe Ser Gly 1460 1465 1470 Thr Val Arg Thr Asn Leu Asp Pro Phe Gly Glu Phe Gly Ala Asp 1475 1480 1485 Ala Ile Leu Trp Glu Ala Leu Arg Asp Cys Gly Leu Glu Glu Gln 1490 1495 1500 Val Lys Ala Cys Gly Gly Leu Asp Ala Lys Leu Asp Gly Thr Gly 1505 1510 1515 Gly Asn Ala Trp Ser Ile Gly Gln Gln Gln Leu Met Cys Leu Ala 1520 1525 1530 Arg Ala Ala Leu Lys Lys Val Pro Val Leu Cys Leu Asp Glu Ala 1535 1540 1545 Thr Ala Ala Met Asp Pro His Thr Glu Ala His Val Leu Glu Ile 1550 1555 1560 Ile Glu Arg Ile Phe Ser Asp Arg Thr Met Leu Thr Ile Ala His 1565 1570 1575 Arg Leu Asp Asn Val Ile Arg Ser Asp Leu Val Val Val Met Asp 1580 1585 1590 Ala Gly Gln Val Cys Glu Met Gly Thr Pro Asp Glu Leu Leu Ala 1595 1600 1605 Asn Pro Gln Ser Ala Phe Ser Gln Leu Val Asp Lys Thr Gly Ala 1610 1615 1620 Ala Ser Ala Ala Ala Leu Arg Lys Met Ala Ala Asp Phe Leu Asp 1625 1630 1635 Glu Arg Ala Arg Gly Gln Lys Leu Gly Phe Lys Pro Arg Pro Ser 1640 1645 1650 Leu Glu Glu Ser His Ile Cys Val Ala Pro Ser Pro Ser Leu Ile 1655 1660 1665 Leu Ser Thr Leu Leu Phe Pro Pro Ala Phe Met Ala Asn Val Thr 1670 1675 1680 Ala Leu Leu Leu Pro Lys Pro Val Leu Ser His Ala Pro Val Ser 1685 1690 1695 Ser Gln Thr Val Asn Thr Tyr Ile Arg Leu Asn Ile Ile Gln Leu 1700 1705 1710 Gln Cys Asn Val Leu His Pro Ala Thr Lys Glu Ala Thr Trp Ser 1715 1720 1725 Ser Arg Arg Ile Thr Phe Thr Ala His Leu Ser Ser Ser Gly Ser 1730 1735 1740 Lys Pro Pro Pro Pro Leu Pro Pro Leu Thr Glu Leu Pro Glu Gly 1745 1750 1755 Arg Gly Leu Asp Trp Ser Ser Ala Gly Tyr Arg Asp Gly Arg Glu
1760 1765 1770 Ala Ile Pro Ser Pro Ser Ala Lys Tyr Ser Ala Ala Asp Tyr Gly 1775 1780 1785 Ala Ala Gly Asp Gly Val Thr Asp Asp Thr Gln Ala Leu Gln Val 1790 1795 1800 Ala Val Ala Ala Ala His Glu Asp Asp Glu Gly Gly Val Val Tyr 1805 1810 1815 Leu Gly Ala Gly Thr Phe Val Leu Thr Gln Pro Leu Ser Ile Ala 1820 1825 1830 Gly Ser Asn Val Val Ile Arg Gly Ala Gly Glu Asp Ala Thr Thr 1835 1840 1845 Ile Phe Val Pro Leu Pro Leu Ser Asp Val Phe Pro Gly Thr Trp 1850 1855 1860 Ser Met Asp Ala Ser Gly Lys Val Thr Ser Pro Trp Ile Thr Arg 1865 1870 1875 Gly Gly Phe Leu Ala Phe Ser Gly Arg Arg Thr Lys Ser Ser Asp 1880 1885 1890 Ser Ser Thr Leu Leu Ala Thr Val Ala Gly Ser Val Glu Gln Gly 1895 1900 1905 Ala Ser Val Ile Pro Val Asp Ser Thr Ala Glu Phe Arg Leu Gly 1910 1915 1920 Gln Trp Val Arg Ile Ile Ile Asn Asp Ala Ser Thr Asp Ala Ser 1925 1930 1935 Ala Gly Gly Gly Thr Leu Glu Arg Gly Ser Ser Glu Val Gln Glu 1940 1945 1950 Ser Glu Thr Met Ile Ala Glu Gly Ala Thr Gly Gly Gly Ala Gly 1955 1960 1965 Val Arg Ala Gln Trp Thr Gly Val Leu His Ala Phe Glu Pro Thr 1970 1975 1980 Val Gln Cys Ser Gly Val Glu Gln Leu Thr Ile Arg Phe Asn His 1985 1990 1995 Ser Met Met Ala Ala His Leu Ala Glu Arg Gly Tyr Asn Ala Ile 2000 2005 2010 Glu Leu Glu Asp Val Val Asp Cys Trp Ile Arg Gln Val Thr Ile 2015 2020 2025 Leu Asn Ala Asp Asn Ala Ile Arg Leu Arg Gly Thr Asp His Ser 2030 2035 2040 Thr Leu Ser Gly Gln Ala Cys Ser Gly Gly Gly Val Val Ala Val 2045 2050 2055 Val Pro Val Trp Cys Arg Arg Gly Leu Pro Ser Pro Ala Asp Val 2060 2065 2070 Thr Val Gly Val Thr Glu Leu Arg Trp Glu Pro Asp Thr Arg Glu 2075 2080 2085 Val Asn Gly His His Ala Ile Thr Val Ser Lys Gly His Ala Asn 2090 2095 2100 Leu Val Thr Arg Phe Arg Ile Thr Ala Pro Phe Tyr His Asp Ile 2105 2110 2115 Ser Leu Glu Gly Gly Ala Leu Leu Asn Val Ile Ser Ser Gly Gly 2120 2125 2130 Gly Ala Asn Leu Asn Leu Asp Leu His Arg Ser Gly Pro Trp Gly 2135 2140 2145 Asn Leu Phe Ser Gln Leu Gly Met Gly Leu Ala Ala Arg Pro Phe 2150 2155 2160 Asp Ala Gly Gly Arg Asp Gly Arg Gly Ala His Ala Gly Arg Gln 2165 2170 2175 Asn Thr Phe Trp
Asn Leu Gln Pro Gly Asp Val Ala Ala Ala Ala 2180 2185 2190 Pro Ala Leu Gln Pro Ser Ala Ala Ala Gly Asp Ala Arg Arg Leu 2195 2200 2205 Leu Val Asp Gly Asp Ser Leu Leu His Ala Gly Thr Gly Gln Ala 2210 2215 2220 Arg Leu Leu Arg Gln Leu Glu Ala Asp Asp Ser Ala Glu Pro Leu 2225 2230 2235 Leu Leu Pro Ser Cys Glu Phe Gly Pro Leu Leu Asn Phe Val Gly 2240 2245 2250 Gly Phe Ala Gly Glu Leu Cys Lys Ser Ser Gly Trp Leu Val Ala 2255 2260 2265 Gly Leu Pro Asp Asp Arg Pro Asp Leu His Ala Ser Gln Val Thr 2270 2275 2280 Ala Arg Leu Gln His Gly Ala Ala Asp Asn Lys Thr His Ala 2285 2290 2295
71 373 PRT Synechocystis PCC 6803
71 Met Asp Phe Leu Ser Asn Phe Leu Met Asp Phe Val Lys Gln Leu Gln 1 5 10 15 Ser Pro Thr Leu Ser Phe Leu Ile Gly Gly Met Val Ile Ala Ala Cys 20 25 30 Gly Ser Gln Leu Gln Ile Pro Glu Ser Ile Cys Lys Ile Ile Val Phe 35 40 45 Met Leu Leu Thr Lys Ile Gly Leu Thr Gly Gly Met Ala Ile Arg Asn 50 55 60 Ser Asn Leu Thr Glu Met Val Leu Pro Ala Leu Phe Ser Val Ala Ile 65 70 75 80 Gly Ile Leu Ile Val Phe Ile Ala Arg Tyr Thr Leu Ala Arg Met Pro 85 90 95 Lys Val Lys Thr Val Asp Ala Ile Ala Thr Gly Gly Leu Phe Gly Ala 100 105 110 Val Ser Gly Ser Thr Met Ala Ala Ala Leu Thr Leu Leu Glu Glu Gln 115 120 125 Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro Phe Met Asp 130 135 140 Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala Asn Ile Tyr Leu Asn 145 150 155 160 Lys Lys Lys Arg Lys Glu Ala Ala Phe Ala Ser Ala Gln Gly Ala Tyr 165 170 175 Ser Lys Gln Pro Val Ala Ala Gly Asp Tyr Ser Ser Ser Ser Asp Tyr 180 185 190 Pro Ser Ser Arg Arg Glu Tyr Ala Gln Gln Glu Ser Gly Asp His Arg 195 200 205 Val Lys Ile Trp Pro Ile Val Glu Glu Ser Leu Gln Gly Pro Ala Leu 210 215 220 Ser Ala Met Leu Leu Gly Val Ala Leu Gly Leu Phe Ala Arg Pro Glu 225 230 235 240 Ser Val Tyr Glu Gly Phe Tyr Asp Pro Leu Phe Arg Gly Leu Leu Ser 245 250 255 Ile Leu Met Leu Val Met Gly Met Glu Ala Trp Ser Arg Ile Ser Glu 260 265 270 Leu Arg Lys Val Ala Gln Trp Tyr Val Val Tyr Ser Ile Val Ala Pro 275 280 285 Leu Ala His Gly Phe Ile Ala Phe Gly Leu Gly Met Ile Ala His Tyr 290 295 300 Ala Thr Gly
Phe Ser Met Gly Gly Val Val Val Leu Ala ValIle Ala305 310 315 320Ala Ser Ser Ser Asp Ile Ser Gly Pro Pro ThrLeu Arg Ala Gly Ile 325 330 335Pro Ser Ala Asn Pro Ser Ala Tyr IleGly Ala Ser Thr Ala Ile Gly 340 345 350Thr Pro Val Ala Ile Gly IleAla Ile Pro Leu Phe Leu Gly Leu Ala 355 360 365Gln Thr Ile Gly Gly37072374PRTSynechocystis PCC 6803 72Met Asp Phe Leu Ser Asn Phe LeuThr Asp Phe Val Gly Gln Leu Gln1 5 10 15Ser Pro Thr Leu Ala Phe LeuIle Gly Gly Met Val Ile Ala Ala Leu 20 25 30Gly Thr Gln Leu Val IlePro Glu Ala Ile Ser Thr Ile Ile Val Phe 35 40 45Met Leu Leu Thr LysIle Gly Leu Thr Gly Gly Met Ala Ile Arg Asn 50 55 60Ser Asn Leu ThrGlu Met Leu Leu Pro Val Ala Phe Ser Val Ile Leu65 70 75 80Gly IleLeu Ile Val Phe Ile Ala Arg Phe Thr Leu Ala Lys Leu Pro 85 90 95AsnVal Arg Thr Val Asp Ala Leu Ala Thr Gly Gly Leu Phe Gly Ala 100 105110Val Ser Gly Ser Thr Met Ala Ala Ala Leu Thr Thr Leu Glu Glu Ser115 120 125Lys Ile Ser Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro PheMet Asp 130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala AsnIle Tyr Leu Asn145 150 155 160Lys Arg Lys Arg Lys Ser Ala Ala AlaSer Ile Glu Glu Ser Phe Ser 165 170 175Lys Gln Pro Val Ala Ala GlyAsp Tyr Gly Asp Gln Thr Asp Tyr Pro 180 185 190Arg Thr Arg Gln GluTyr Leu Ser Gln Gln Glu Pro Glu Asp Asn Arg 195 200 205Val Lys IleTrp Pro Ile Ile Glu Glu Ser Leu Gln Gly Pro Ala Leu 210 215 220SerAla Met Leu Leu Gly Leu Ala Leu Gly Ile Phe Thr Lys Pro Glu225 230235 240Ser Val Tyr Glu Gly Phe Tyr Asp Pro Leu Phe Arg Gly Leu LeuSer 245 250 255Ile Leu Met Leu Ile Met Gly Met Glu Ala Trp Ser ArgIle Gly Glu 260 265 270Leu Arg Lys Val Ala Gln Trp Tyr Val Val TyrSer Leu Ile Ala Pro 275 280 285Ile Val His Gly Phe Ile Ala Phe GlyLeu Gly Met Ile Ala His Tyr 290 295 300Ala Thr Gly Phe Ser Leu GlyGly Val Val Val Leu Ala Val Ile Ala305 310 315 320Ala Ser Ser SerAsp Ile Ser Gly Pro Pro Thr Leu Arg Ala Gly Ile 325 330 335Pro SerAla Asn Pro Ser Ala Tyr Ile Gly Ser Ser Thr Ala Ile Gly 340 345350Thr Pro Ile Ala Ile Gly Val Cys Ile Pro 
Leu Phe Ile Gly Leu Ala355 360 365Gln Thr Leu Gly Ala Gly 37073370PRTNostoc PCC7120MISC_FEATUREAnabaena 73Met Asp Phe Phe Ser Leu Phe Leu Met AspPhe Val Lys Gln Leu Gln1 5 10 15Ser Pro Thr Leu Gly Phe Leu Ile GlyGly Met Val Ile Ala Ala Leu 20 25 30Gly Ser Glu Leu Ile Ile Pro GluAla Ile Cys Gln Ile Ile Val Phe 35 40 45Met Leu Leu Thr Lys Ile GlyLeu Thr Gly Gly Ile Ala Ile Arg Asn 50 55 60Ser Asn Leu Thr Glu MetVal Leu Pro Ala Ala Ser Ala Val Ala Val65 70 75 80Gly Val Leu ValVal Phe Ile Ala Arg Tyr Thr Leu Ala Lys Leu Pro 85 90 95Lys Val AsnThr Val Asp Ala Ile Ala Thr Gly Gly Leu Phe Gly Ala 100 105 110ValSer Gly Ser Thr Met Ala Ala Ala Leu Thr Leu Leu Glu Glu Gln 115 120125Lys Ile Gln Tyr Glu Ala Trp Ala Ala Ala Leu Tyr Pro Phe Met Asp130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala Asn Ile TyrLeu Asn145 150 155 160Lys Lys Lys Arg Ser Ala Ala Gly Glu Tyr LeuSer Lys Gln Ser Val 165 170 175Ala Ala Gly Glu Tyr Pro Asp Gln GlnAsp Tyr Pro Ser Ser Arg Gln 180 185 190Glu Tyr Leu Arg Lys Gln GlnSer Ala Asp Asn Arg Val Lys Ile Trp 195 200 205Pro Ile Val Lys GluSer Leu Gln Gly Pro Ala Leu Ser Ala Met Leu 210 215 220Leu Gly IleAla Leu Gly Leu Phe Thr Gln Pro Glu Ser Val Tyr Lys225 230 235240Ser Phe Tyr Asp Pro Leu Phe Arg Gly Leu Leu Ser Ile Leu Met Leu245 250 255Val Met Gly Met Glu Ala Trp Ser Arg Ile Gly Glu Leu ArgLys Val 260 265 270Ala Gln Trp Tyr Val Val Tyr Ser Val Val Ala ProLeu Val His Gly 275 280 285Phe Ile Ala Phe Gly Leu Gly Met Ile AlaHis Tyr Ala Thr Gly Phe 290 295 300Ser Leu Gly Gly Val Val Ile LeuAla Val Ile Ala Ala Ser Ser Ser305 310 315 320Asp Ile Ser Gly ProPro Thr Leu Arg Ala Gly Ile Pro Ser Ala Asn 325 330 335Pro Ser AlaTyr Ile Gly Ala Ser Thr Ala Ile Gly Thr Pro Ile Ala 340 345 350IleGly Leu Ala Ile Pro Leu Phe Leu Gly Leu Ala Gln Ala Ile Gly 355 360365Gly Arg 37074377PRTCyanothece sp. 
PCC 7425 74Met Asp Phe Trp SerTyr Phe Leu Met Asp Phe Val Lys Gln Leu Gln1 5 10 15Ser Pro Thr LeuGly Phe Leu Ile Gly Gly Met Val Ile Ala Ala Leu 20 25 30Gly Ser GlnLeu Val Ile Pro Glu Ala Ile Cys Gln Ile Ile Val Phe 35 40 45Met LeuLeu Thr Lys Ile Gly Leu Thr Gly Gly Met Ala Ile Arg Asn 50 55 60SerAsn Leu Thr Glu Met Val Leu Pro Ala Ala Phe Ser Val Ile Ser65 70 7580Gly Ile Leu Ile Val Phe Ile Ala Arg Tyr Thr Leu Ala Lys Leu Pro85 90 95Lys Val Arg Thr Val Asp Ala Ile Ala Thr Gly Gly Leu Phe GlyAla 100 105 110Val Ser Gly Ser Thr Met Ala Ala Ala Leu Thr Leu LeuGlu Glu Glu 115 120 125Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala LeuTyr Pro Phe Met Asp 130 135 140Ile Pro Ala Leu Val Thr Ala Ile ValIle Ala Asn Ile Tyr Leu Asn145 150 155 160Lys Lys Lys Arg Arg AlaGlu Ser Glu Ala Leu Ser Lys Gln Glu Tyr 165 170 175Leu Gly Lys GlnSer Ile Val Ala Gly Asp Tyr Pro Ala Gln Gln Asp 180 185 190Tyr ProSer Thr Arg Gln Glu Tyr Leu Ser Lys Gln Gln Gly Pro Glu 195 200205Asn Asn Arg Val Lys Ile Trp Pro Ile Val Gln Glu Ser Leu Gln Gly210 215 220Pro Ala Leu Ser Ala Met Leu Leu Gly Val Ala Leu Gly IleLeu Thr225 230 235 240Lys Pro Glu Ser Val Tyr Glu Ser Phe Tyr AspPro Leu Phe Arg Gly 245 250 255Leu Leu Ser Ile Leu Met Leu Val MetGly Met Glu Ala Trp Ser Arg 260 265 270Ile Gly Glu Leu Arg Lys ValAla Gln Trp Tyr Val Val Tyr Ser Val 275 280 285Val Ala Pro Phe ValHis Gly Leu Ile Ala Phe Gly Leu Gly Met Phe 290 295 300Ala His TyrThr Met Gly Phe Ser Met Gly Gly Val Val Val Leu Ala305 310 315320Val Ile Ala Ser Ser Ser Ser Asp Ile Ser Gly Pro Pro Thr Leu Arg325 330 335Ala Gly Ile Pro Ser Ala Asn Pro Ser Ala Tyr Ile Gly AlaSer Thr 340 345 350Ala Ile Gly Thr Pro Ile Ala Ile Gly Leu Cys IlePro Phe Phe Ile 355 360 365Gly Leu Ala Gln Thr Leu Gly Gly Gly 37037575373PRTMicrocystis aeruginosa 75Met Asp Phe Phe Ser Leu Phe ValMet Asp Phe Ile Gln Gln Leu Gln1 5 10 15Ser Pro Thr Leu Ala Phe LeuIle Gly Gly Met Ile Ile Ala Ala Leu 20 25 30Gly Ser Glu Leu Val IlePro Glu Ser Ile Cys Thr Ile Ile Val Phe 35 40 45Met Leu 
Leu Thr LysIle Gly Leu Thr Gly Gly Ile Ala Ile Arg Asn 50 55 60Ser Asn Leu ThrGlu Met Val Leu Pro Met Ile Phe Ala Val Ile Val65 70 75 80Gly IleIle Val Val Phe Val Ala Arg Tyr Thr Leu Ala Asn Leu Pro 85 90 95LysVal Lys Val Val Asp Ala Ile Ala Thr Gly Gly Leu Phe Gly Ala 100 105110Val Ser Gly Ser Thr Met Ala Ala Gly Leu Thr Val Leu Glu Glu Gln115 120 125Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro PheMet Asp 130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala AsnIle Tyr Leu Asn145 150 155 160Lys Lys Lys Gln Lys Glu Ala Ala TyrAsp Gln Glu Ser Phe Ser Lys 165 170 175Gln Pro Val Ala Ala Gly AsnTyr Ser Asp Gln Gln Asp Tyr Pro Ser 180 185 190Ser Arg Gln Glu TyrLeu Ser Gln Gln Gln Pro Ala Asp Asn Arg Val 195 200 205Lys Ile TrpPro Ile Ile Glu Glu Ser Leu Arg Gly Pro Ala Leu Ser 210 215 220AlaMet Leu Leu Gly Leu Ala Leu Gly Ile Phe Thr Gln Pro Glu Ser225 230235 240Val Tyr Lys Ser Phe Tyr Asp Pro Leu Phe Arg Gly Leu Leu SerVal 245 250 255Leu Met Leu Val Met Gly Met Glu Ala Trp Ser Arg ValGly Glu Leu 260 265 270Arg Lys Val Ala Gln Trp Tyr Val Val Tyr SerVal Ile Ala Pro Phe 275 280 285Val His Gly Leu Ile Ala Phe Gly LeuGly Met Ile Ala His Tyr Ala 290 295 300Thr Gly Phe Ser Trp Gly GlyVal Val Met Leu Ala Val Ile Ala Ser305 310 315 320Ser Ser Ser AspIle Ser Gly Pro Pro Thr Leu Arg Ala Gly Ile Pro 325 330 335Ser AlaAsn Pro Ser Ala Tyr Ile Gly Ala Ser Thr Ala Ile Gly Thr 340 345350Pro Val Ala Ile Gly Leu Cys Ile Pro Phe Phe Val Gly Leu Ala Gln355 360 365Ala Leu Ser Gly Gly 37076369PRTAnabaena variabilis 76MetAsp Phe Val Ser Leu Phe Val Lys Asp Phe Ile Ala Gln Leu Gln1 5 1015Ser Pro Thr Leu Ala Phe Leu Ile Gly Gly Met Ile Ile Ala Ala Leu20 25 30Gly Ser Glu Leu Val Ile Pro Glu Ser Ile Cys Thr Ile Ile ValPhe 35 40
45Met Leu Leu Thr Lys Ile Gly Leu Thr Gly Gly Ile Ala Ile Arg Asn50 55 60Ser Asn Leu Thr Glu Met Val Leu Pro Met Ile Phe Ala Val IleThr65 70 75 80Gly Ile Thr Ile Val Phe Ile Ser Arg Tyr Thr Leu AlaLys Leu Pro 85 90 95Lys Val Lys Val Val Asp Ala Ile Ala Thr Gly GlyLeu Phe Gly Ala 100 105 110Val Ser Gly Ser Thr Met Ala Ala Gly LeuThr Val Leu Glu Glu Gln 115 120 125Lys Met Ala Tyr Glu Ala Trp AlaGly Ala Leu Tyr Pro Phe Met Asp 130 135 140Ile Pro Ala Leu Val ThrAla Ile Val Ile Ala Asn Ile Tyr Leu Asn145 150 155 160Lys Lys LysArg Lys Glu Ala Val Tyr Ser Thr Glu Gln Pro Val Ala 165 170 175AlaGly Asp Tyr Pro Asp Gln Lys Asp Tyr Pro Ser Ser Arg Gln Glu 180 185190Tyr Leu Ser Gln Gln Lys Gly Asp Glu Asp Asn Arg Val Lys Ile Trp195 200 205Pro Ile Ile Glu Glu Ser Leu Arg Gly Pro Ala Leu Ser AlaMet Leu 210 215 220Leu Gly Leu Ala Leu Gly Leu Phe Thr Gln Pro GluSer Val Tyr Lys225 230 235 240Ser Phe Tyr Asp Pro Ala Phe Arg GlyLeu Leu Ser Ile Leu Met Leu 245 250 255Val Met Gly Met Glu Ala TrpSer Arg Ile Gly Glu Leu Arg Lys Val 260 265 270Ala Gln Trp Tyr ValVal Tyr Ser Val Val Ala Pro Phe Val His Gly 275 280 285Leu Ile AlaPhe Gly Leu Gly Met Ile Ala His Tyr Thr Met Asn Phe 290 295 300SerMet Gly Gly Val Val Ile Leu Ala Val Ile Ala Ser Ser Ser Ser305 310315 320Asp Ile Ser Gly Pro Pro Thr Leu Arg Ala Gly Ile Pro Ser AlaAsn 325 330 335Pro Ser Ala Tyr Ile Gly Ala Ser Thr Ala Val Gly ThrPro Val Ala 340 345 350Ile Gly Leu Cys Ile Pro Phe Phe Leu Gly LeuAla Gln Ala Ile Gly 355 360 365Gly771082PRTChlamydomonasreinhardtii 77Met Leu Pro Gly Leu Gly Val Ile Leu Leu Val Leu ProMet Gln Tyr1 5 10 15Tyr Phe Gly Tyr Lys Ile Val Gln Ile Lys Leu GlnAsn Ala Lys His 20 25 30Val Ala Leu Arg Ser Ala Ile Met Gln Glu ValLeu Pro Ala Ile Lys 35 40 45Leu Val Lys Tyr Tyr Ala Trp Glu Gln PhePhe Glu Asn Gln Ile Ser 50 55 60Lys Val Arg Arg Glu Glu Ile Arg LeuAsn Phe Trp Asn Cys Val Met65 70 75 80Lys Val Ile Asn Val Ala CysVal Phe Cys Val Pro Pro Met Thr Ala 85 90 95Phe Val Ile Phe Thr ThrTyr Glu Phe Gln Arg Ala Arg 
Leu Val Ser 100 105 110Ser Val Ala PheThr Thr Leu Ser Leu Phe Asn Ile Leu Arg Phe Pro 115 120 125Leu ValVal Leu Pro Lys Ala Leu Arg Ala Val Ser Glu Ala Asn Ala 130 135140Ser Leu Gln Arg Leu Glu Ala Tyr Leu Leu Glu Glu Val Pro SerGly145 150 155 160Thr Ala Ala Val Lys Thr Pro Lys Asn Ala Pro ProGly Ala Val Ile 165 170 175Glu Asn Gly Val Phe His His Pro Ser AsnPro Asn Trp His Leu His 180 185 190Val Pro Lys Phe Glu Val Lys ProGly Gln Val Val Ala Val Val Gly 195 200 205Arg Ile Ala Ala Gly LysSer Ser Leu Val Gln Ala Ile Leu Gly Asn 210 215 220Met Val Lys GluHis Gly Ser Phe Asn Val Gly Gly Arg Ile Ser Tyr225 230 235 240ValPro Gln Asn Pro Trp Leu Gln Asn Leu Ser Leu Arg Asp Asn Val 245 250255Leu Phe Gly Glu Gln Phe Asp Glu Asn Lys Tyr Thr Asp Val Ile Glu260 265 270Ser Cys Ala Leu Thr Leu Asp Leu Gln Ile Leu Ser Asn GlyAsp Gln 275 280 285Ser Lys Ala Gly Ile Arg Gly Val Asn Phe Ser GlyGly Gln Arg Gln 290 295 300Arg Val Asn Leu Ala Arg Cys Ala Tyr AlaAsp Ala Asp Leu Val Leu305 310 315 320Leu Asp Asn Ala Leu Ser AlaVal Asp His His Thr Ala His His Ile 325 330 335Phe Asp Lys Cys IleLys Gly Leu Phe Ser Asp Lys Ala Val Val Leu 340 345 350Val Thr HisGln Ile Glu Phe Met Pro Arg Cys Asp Asn Val Ala Ile 355 360 365MetAsp Glu Gly Arg Cys Leu Tyr Phe Gly Lys Trp Asn Glu Glu Ala 370 375380Gln His Leu Leu Gly Lys Leu Leu Pro Ile Thr His Leu Leu HisAla385 390 395 400Ala Gly Ser Gln Glu Ala Pro Pro Ala Pro Lys LysLys Ala Glu Asp 405 410 415Lys Ala Gly Pro Gln Lys Ser Gln Ser LeuGln Leu Thr Leu Ala Pro 420 425 430Thr Ser Ile Gly Lys Pro Thr GluLys Pro Lys Asp Val Gln Lys Leu 435 440 445Thr Ala Tyr Gln Ala AlaLeu Ile Tyr Thr Trp Tyr Gly Asn Leu Phe 450 455 460Leu Val Gly ValCys Phe Phe Phe Phe Leu Ala Ala Gln Cys Ser Arg465 470 475 480GlnIle Ser Asp Phe Trp Val Arg Trp Trp Val Asn Asp Glu Tyr Lys 485 490495Lys Phe Pro Val Lys Gly Glu Gln Asp Ser Ala Ala Thr Thr Phe Tyr500 505 510Cys Leu Ile Tyr Leu Leu Leu Val Gly Leu Phe Tyr Ile PheMet Ile 515 520 525Phe Arg Gly Ala Thr Phe Leu Trp Trp Val 
Leu LysSer Ser Glu Thr 530 535 540Ile Arg Arg Lys Ala Leu His Asn Val LeuAsn Ala Pro Met Gly Phe545 550 555 560Phe Leu Val Thr Pro Val GlyAsp Leu Leu Leu Asn Phe Thr Lys Asp 565 570 575Gln Asp Ile Met AspGlu Asn Leu Pro Asp Ala Val His Phe Met Gly 580 585 590Ile Tyr GlyLeu Ile Leu Leu Ala Thr Thr Ile Thr Val Ser Val Thr 595 600 605IleAsn Phe Phe Ala Ala Phe Thr Gly Ala Leu Ile Ile Met Thr Leu 610 615620Ile Met Leu Ser Ile Tyr Leu Pro Ala Ala Thr Ala Leu Lys LysAla625 630 635 640Arg Ala Val Ser Gly Gly Met Leu Val Gly Leu ValAla Glu Val Leu 645 650 655Glu Gly Leu Gly Val Val Gln Ala Phe AsnLys Gln Glu Tyr Phe Ile 660 665 670Glu Glu Ala Ala Arg Arg Thr AsnIle Thr Asn Ser Ala Val Phe Asn 675 680 685Ala Glu Ala Leu Asn LeuTrp Leu Ala Phe Trp Cys Asp Phe Ile Gly 690 695 700Ala Cys Leu ValGly Val Val Ser Ala Phe Ala Val Gly Met Ala Lys705 710 715 720AspLeu Gly Gly Ala Thr Val Gly Leu Ala Phe Ser Asn Ile Ile Gln 725 730735Met Leu Val Phe Tyr Thr Trp Val Val Arg Phe Ile Ser Glu Ser Ile740 745 750Ser Leu Phe Asn Ser Val Glu Gly Met Ala Tyr Leu Ala AspTyr Val 755 760 765Pro His Asp Gly Val Phe Tyr Asp Gln Arg Gln LysAsp Gly Val Ala 770 775 780Lys Gln Ile Val Leu Pro Asp Gly Asn IleVal Pro Ala Ala Ser Lys785 790 795 800Val Gln Val Val Val Asp AspAla Ala Leu Ala Arg Trp Pro Ala Thr 805 810 815Gly Asn Ile Arg PheGlu Asp Val Trp Met Gln Tyr Arg Leu Asp Ala 820 825 830Pro Trp AlaLeu Lys Gly Val Thr Phe Lys Ile Asn Asp Gly Glu Lys 835 840 845ValGly Ala Val Gly Arg Thr Gly Ser Gly Lys Ser Thr Thr Leu Leu 850 855860Ala Leu Tyr Arg Met Phe Glu Leu Gly Lys Gly Arg Ile Leu ValAsp865 870 875 880Gly Val Asp Ile Ala Thr Leu Ser Leu Lys Arg LeuArg Thr Gly Leu 885 890 895Ser Ile Ile Pro Gln Glu Pro Val Met PheThr Gly Thr Val Arg Ser 900 905 910Asn Leu Asp Pro Phe Gly Glu PheLys Asp Asp Ala Ile Leu Trp Glu 915 920 925Val Leu Lys Lys Val GlyLeu Glu Asp Gln Ala Gln His Ala Gly Gly 930 935 940Leu Asp Gly GlnVal Asp Gly Thr Gly Gly Lys Ala Trp Ser Leu Gly945 950 955 960GlnMet Gln Leu Val Cys Leu 
Ala Arg Ala Ala Leu Arg Ala Val Pro 965 970975Ile Leu Cys Leu Asp Glu Ala Thr Ala Ala Met Asp Pro His Thr Glu980 985 990Ala Ile Val Gln Gln Thr Ile Lys Lys Val Phe Asp Asp ArgThr Thr 995 1000 1005Ile Thr Ile Ala His Arg Leu Asp Thr Ile IleGlu Ser Asp Lys 1010 1015 1020Ile Ile Val Met Glu Gln Gly Ser LeuMet Glu Tyr Glu Ser Pro 1025 1030 1035Ser Lys Leu Leu Ala Asn ArgAsp Ser Met Phe Ser Lys Leu Val 1040 1045 1050Asp Lys Thr Gly ProAla Ala Ala Ala Ala Leu Arg Lys Met Ala 1055 1060 1065Glu Asp PheTrp Ser Thr Arg Ser Ala Gln Gly Arg Asn Gln 1070 1075108078366PRTCyanothece 78Met Asp Phe Leu Ser Leu Phe Val Lys AspPhe Ile Ile Gln Leu Gln1 5 10 15Ser Pro Thr Leu Ala Phe Leu Ile GlyGly Met Val Ile Ala Ala Leu 20 25 30Gly Ser Glu Leu Val Ile Pro GluSer Ile Cys Thr Ile Ile Val Phe 35 40 45Met Leu Leu Thr Lys Ile GlyLeu Thr Gly Gly Ile Ala Ile Arg Asn 50 55 60Ser Asn Leu Thr Glu MetVal Leu Pro Met Ile Cys Ala Val Ile Val65 70 75 80Gly Ile Val ValVal Phe Ile Ala Arg Tyr Thr Leu Ala Lys Leu Pro 85 90 95Lys Val AsnVal Val Asp Ala Ile Ala Thr Gly Gly Leu Phe Gly Ala 100 105 110ValSer Gly Ser Thr Met Ala Ala Gly Leu Thr Val Leu Glu Glu Gln 115 120125Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro Phe Met Asp130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala Asn Ile TyrLeu Asn145 150 155 160Lys Lys Lys Arg Lys Ala Thr Val Met Gln GluSer Leu Ser Lys Gln 165 170 175Pro Val Ala Ala Gly Asp Tyr Pro SerSer Arg Gln Glu Tyr Val Ser 180 185 190Gln Gln Gln Pro Glu Asp AsnArg Val Lys Ile Trp Pro Ile Ile Glu 195 200 205Glu Ser Leu Arg GlyPro Ala Leu Ser Ala Met Leu Leu Gly Leu Ala 210 215 220Leu Gly IleLeu Thr Gln Pro Glu Ser Val Tyr Lys Gly Phe Tyr Asp225 230 235240Pro Pro Phe Arg Gly Leu Leu Ser Ile Leu Met Leu Val Met Gly Met245 250 255Glu Ala Trp Ser Arg Ile Gly Glu Leu Arg Lys Val Ala GlnTrp Tyr 260 265 270Val Val Tyr Ser Val Ala Ala Pro Phe Ile His GlyLeu Leu Ala Phe 275 280 285Gly Leu Gly Met Ile Ala His Tyr Thr MetGly Phe Ser Met Gly Gly 290 295 300Val Val Ile Leu Ala Val Ile 
AlaSer Ser Ser Ser Asp Ile Ser Gly305 310 315 320Pro Pro Thr Leu ArgAla Gly Ile Pro Ser Ala Asn Pro Ser Ala Tyr 325 330 335Ile Gly AlaSer Thr Ala Ile Gly Thr Pro Val Ala Ile Gly Leu Cys 340 345 350IlePro Phe Phe Val Gly Leu Ala Gln Ala Ile Gly Gly Phe 355 36036579337PRTVolvox carteriMISC_FEATUREf. nagariensis 79Met Gln ThrThr Met Ser Val Thr Arg Pro Cys Val Gly Leu Arg Pro1 5 10 15Leu ProVal Arg Asn Val Arg Ser Leu Ile Arg Ala Gln Ala Ala Pro 20 25 30GlnGln Val Ser Thr Ala Val Ser Thr Asn Gly Asn Gly Asn Gly Val 35 4045Ala Ala Ala Ser Leu Ser Val Pro Ala Pro Val Ala Ala Pro Ala Gln50 55 60Ala Val Ser Thr Pro Val Arg Ala Val Ser Val Leu Thr Pro ProGln65 70 75 80Val Tyr Glu Asn Ala Ala Asn Val Gly Ala Tyr Lys AlaSer Leu Gly 85 90 95Val Leu Ala Thr Phe Val Gln Gly Ile Gln Ala GlyAla Tyr Ile Ala 100 105 110Phe Gly Ala Phe Leu Ala Cys Ser Val GlyGly Asn Ile Pro Gly Ile 115 120 125Thr Ala Ser Asn Pro Gly Leu AlaLys Leu Leu Phe Ala Leu Val Phe 130 135 140Pro Val Gly Leu Ser MetVal Thr Asn Cys Gly Ala Glu Leu Tyr Thr145 150 155 160Gly Asn ThrMet Met Leu Thr Cys Ala Ile Phe Glu Lys Lys Ala Thr 165 170 175TrpAla Gln Leu Val Lys Asn Trp Val Val Ser Tyr Ala Gly Asn Phe 180 185190Val Gly Ser Ile Ala Met Val Ala Ala Val Val Ala Thr Gly Leu Met195 200 205Ala Ser Asn Gln Leu Pro Val Asn Met Ala Thr Ala Lys SerSer Leu 210 215 220Gly Phe Thr Glu Val Leu Ser Arg Ser Ile Leu CysAsn Trp Leu Val225 230 235 240Cys Cys Ala Val Trp Ser Ala Ser AlaAla Thr Ser Leu Pro Gly Arg 245 250 255Ile Leu Gly Leu Trp Pro ProIle Thr Ala Phe Val Ala Ile Gly Leu 260 265 270Glu His Ser Val AlaAsn Met Phe Val Ile Pro Leu Gly Met Met Leu 275 280 285Gly Ala AspVal Thr Trp Ser Gln Phe Phe Phe Asn Asn Leu Val Pro 290 295 300ValThr Leu Gly Asn Thr Ile Ala Gly Val Val Met Met Ala Val Ala305 310315 320Tyr Ser Val Ser Tyr Gly Ser Leu Gly Lys Thr Pro Lys Pro AlaThr 325 330 335Ala802297PRTChlorella variabilis 80Met Val Pro LeuLeu Ala Gln Arg Gly Arg Ile Arg Ser Gln Ala Pro1 5 10 15Arg Thr TrpHis Pro Asp Pro Gln Pro 
Leu His Ala Glu Arg Ser Arg 20 25 30Gln CysPro Gly Arg Gly Val Arg Ala Ala Ala Lys Arg Gly Gly Gly 35 40 45SerGly Gly Ala Thr His Lys Ser Lys Lys Ser Lys Glu Leu Asp Glu 50 5560Val Ala Ala Phe Glu Gln Leu Met Cys Asp Trp Asp Asp Ala Phe Ala6570 75 80Ala Asp Cys Tyr Asp Asn Glu Arg Ala Ala Arg Met Ala Arg LeuAla 85 90 95Glu Glu Gly Tyr Gln His His Gly Arg Gly Phe Val Phe ValArg Ser 100 105 110Arg Leu Asp Lys Arg Ser Arg Lys Ala Arg Asn AspSer Gly Ala Ser 115 120 125Lys Gly Phe Gly Ala Ala Ala Lys Ala LeuSer Val Glu Gln Gly Thr 130 135 140Pro Leu Glu Asn Asn Pro Gln LeuHis Leu Leu Ser Trp Thr Ala Cys145 150 155 160Tyr Ile Ala Ser SerGln Leu Asp Ser Leu Gly Gly Leu Phe Ser Thr 165 170 175Gln Glu GlyVal Leu Leu Pro Asp Ser Gly Ser Leu Leu Thr Asp Gly 180 185 190GlySer Gly Ala Ser Gly Ser Asn Ala Ala Asp Ala Val Gly Glu Leu 195 200205Gln Arg Val Leu Arg Gly Gln Asp Leu Ser Gln Leu Arg Gly Tyr Val210 215 220Gly Ala Pro Pro Gln Ala Arg Pro Ala Ser Gly Ser Asp AspAsp Gly225 230 235 240Ser Ser Thr Thr Gly Ser Asn Asn Gly Ala AlaGly Glu Gly Ser Glu 245 250 255Val Glu Glu Gly Thr Ala Met Gly GlyIle Arg Arg Tyr Glu Pro Glu 260 265 270Ser Gly Glu Leu Val Val LeuLeu Ser Cys Lys Ile Gly Gly Lys Pro 275 280 285Ala Val Gly Ala GluLeu Leu Ala Val Ala Gln Ala Glu Asp Gly Lys 290 295 300His Ala ProGly Ala Ser Pro Asp Thr Arg Leu Cys Lys Glu Pro Ser305 310 315320Gln Ser Ala Phe Asp Leu Trp Ser Phe Gly Trp Met Asn Lys Ile Val325 330 335Pro Ala Ala Arg Arg Gly Glu
Val Glu Val Ala Asp Leu Pro Leu Pro 340 345 350Glu Ala Gln Gln AlaGlu Pro Cys Tyr Glu Glu Leu Asn Thr Asn Trp 355 360 365Glu Ala AlaVal Gln Glu Ala Lys Lys Ala Gly Lys Glu Pro Lys Leu 370 375 380MetLys Val Leu Trp Lys Thr Tyr Gly Lys Asp Ile Val Leu Ala Gly385 390395 400Ile Phe Lys Leu Met Trp Ser Val Phe Val Ile Leu Gly Ala TyrTyr 405 410 415Phe Thr Arg Ser Ile Leu Met Cys Ile Arg Thr Leu GluGly Lys Asp 420 425 430Asp Ser Ile Tyr Asp Thr Glu Trp Lys Gly TrpVal Leu Thr Gly Phe 435 440 445Phe Phe Leu Asp Ala Trp Leu Leu GlyMet Met Leu Gln Arg Met Ala 450 455 460Phe Asn Cys Leu Lys Val GlyIle Lys Ala Arg Ala Ala Leu Thr Thr465 470 475 480Met Ile Ala ArgLys Cys Tyr Asn Met Ala His Leu Thr Lys Asp Thr 485 490 495Ala AlaGlu Ala Val Gly Phe Val Ala Ser Asp Ile Asn Lys Val Phe 500 505510Glu Gly Ile Gln Glu Val His Tyr Leu Trp Gly Ala Pro Val Glu Ala515 520 525Gly Ala Ile Leu Ala Leu Leu Gly Thr Leu Val Gly Val TyrCys Ile 530 535 540Gly Gly Val Ile Ile Val Cys Met Val Val Pro LeuGln Tyr Tyr Phe545 550 555 560Gly Tyr Lys Ile Ile Lys Asn Lys IleLys Asn Ala Pro Asn Val Thr 565 570 575Glu Arg Trp Ser Ile Ile GlnGlu Ile Leu Pro Ala Met Lys Leu Val 580 585 590Lys Tyr Tyr Ala TrpGlu Arg Phe Phe Glu Lys His Val Ala Asp Met 595 600 605Arg Thr ArgGlu Arg His Tyr Met Phe Trp Asn Ala Val Val Lys Thr 610 615 620ValAsn Val Thr Met Val Phe Gly Val Pro Pro Met Val Thr Phe Ala625 630635 640Val Leu Val Pro Tyr Glu Leu Trp His Val Asp Ser Ser Thr SerGlu 645 650 655Pro Tyr Ile Lys Pro Gln Thr Ala Phe Thr Met Leu SerLeu Phe Asn 660 665 670Val Leu Arg Phe Pro Leu Val Val Leu Pro LysAla Met Arg Cys Val 675 680 685Ser Glu Ala Leu Arg Ser Val Gly AsnLeu Glu Lys Phe Leu Ala Glu 690 695 700Pro Val Ala Pro Arg Gln AspLeu Glu Gly Lys Pro Gly Ala Gln Leu705 710 715 720Ser Lys Ala ValLeu Arg His Glu Met Asp Thr Ser Gly Phe Thr Leu 725 730 735Arg ValPro Glu Phe Ser Val Lys Ala Gly Glu Leu Val Ala Val Val 740 745750Gly Arg Val Gly Ala Gly Lys Ser Ser Ile Leu Gln Ala Met Leu Gly755 760 765Asn Met Gln Thr 
Ala Ser Gly Leu Ala Lys Cys Gln His SerAla Ser 770 775 780Ser Cys Leu Pro Phe Leu Val Glu Gly Thr Ala HisSer Gly Gly Arg785 790 795 800Ile Ala Tyr Val Pro Gln Thr Ala TrpCys Gln Asn Leu Ser Leu Arg 805 810 815Asp Asn Ile Thr Phe Gly GlnPro Trp Asp Glu Ala Lys Tyr Lys Gln 820 825 830Val Ile His Ala CysAla Leu Glu Leu Asp Leu Ala Ile Leu Ala Ala 835 840 845Gly Asp GlnSer Lys Ala Gly Leu Arg Gly Ile Asn Leu Ser Gly Gly 850 855 860GlnArg Gln Arg Leu Asn Leu Ala Arg Cys Ala Tyr Phe Asp Gly Asp865 870875 880Leu Val Leu Leu Asp Asn Ala Leu Ser Ala Val Asp His His ThrAla 885 890 895His His Ile Phe Glu His Cys Val Arg Gly Met Phe ArgAsp Lys Ala 900 905 910Thr Val Leu Val Thr His Gln Val Glu Phe LeuPro Gln Cys Asp Lys 915 920 925Val Ala Ile Met Asp Asp Gly Thr CysVal Tyr Phe Gly Pro Trp Asn 930 935 940Ala Ala Ala Gln Gln Leu LeuSer Lys Tyr Leu Pro Ala Ser His Leu945 950 955 960Leu Ala Ala GlyGly Asn Ala Glu Gln Pro Arg Asp Thr Lys Lys Lys 965 970 975Val ValLys Lys Glu Glu Thr Lys Lys Thr Glu Asp Ala Gly Lys Ala 980 985990Lys Arg Val His Ser Ala Ser Leu Thr Leu Lys Ser Ala Leu Trp Glu995 1000 1005Tyr Cys Trp Asp Ala Arg Trp Ile Ile Phe Cys Leu SerLeu Phe 1010 1015 1020Phe Phe Leu Thr Ala Gln Ala Ser Arg Gln LeuAla Asp Tyr Phe 1025 1030 1035Ile Arg Trp Trp Thr Arg Asp His TyrAsn Lys Tyr Gly Val Leu 1040 1045 1050Cys Ile Asp Glu Gly Asp AsnPro Cys Gly Pro Leu Phe Tyr Val 1055 1060 1065Gln Tyr Tyr Gly IleLeu Gly Leu Leu Cys Phe Ile Val Leu Met 1070 1075 1080Ala Phe ArgGly Ala Phe Leu Tyr Thr Trp Ser Leu Gly Ala Ser 1085 1090 1095TyrArg Gln His Glu Lys Ser Ile His Arg Val Leu Tyr Ala Pro 1100 11051110Leu Gly Phe Phe Leu Thr Thr Pro Val Gly Asp Leu Leu Val Ser1115 1120 1125Phe Thr Lys Asp Gln Asp Val Met Asp Asp Ala Leu ProAsp Ala 1130 1135 1140Leu Tyr Tyr Ala Gly Ile Tyr Gly Leu Ile LeuLeu Ala Thr Ala 1145 1150 1155Ile Thr Val Ser Val Thr Ile Pro LeuPhe Ser Ala Leu Ala Gly 1160 1165 1170Gly Leu Phe Val Val Ser GlyIle Met Leu Ala Ile Tyr Leu Pro 1175 1180 1185Ala Ala Thr His 
LeuLys Lys Leu Arg Met Gly Thr Ser Gly Asp 1190 1195 1200Val Val ThrLeu Ile Ala Glu Ala Leu Asp Gly Leu Gly Val Ile 1205 1210 1215GlnAla Tyr Gly Lys Gln Ala Tyr Phe Thr Thr Ile Thr Ser Gln 1220 12251230Tyr Val Asn Asp Ala His Arg Ala Leu Phe Gly Ala Glu Ser Leu1235 1240 1245Asn Leu Trp Leu Ala Phe Ile Cys Asp Phe Phe Gly AlaCys Met 1250 1255 1260Val Leu Ser Val Ala Cys Phe Gly Ile Gly GlnTrp Ser Thr Leu 1265 1270 1275Gly Ser Ser Ser Val Gly Leu Ala PheSer Gln Ser Ile Gln Met 1280 1285 1290Leu Val Phe Tyr Thr Trp SerIle Arg Leu Val Ala Glu Cys Ile 1295 1300 1305Gly Leu Phe Gly SerAla Glu Lys Ile Ala Trp Leu Ala Asn His 1310 1315 1320Thr Pro GlnGlu Ala Gly Ser Leu Asp Pro Pro Ser Leu Pro Gly 1325 1330 1335SerGly Glu Thr Lys Ala Ala Pro Lys Lys Arg Gly Thr Ala Gly 1340 13451350Lys Phe Leu Pro Pro Leu Lys Asp Glu Asp Leu Ala Ile Val Pro1355 1360 1365Thr Gly Gly Pro Lys Leu Pro Ser Gly Trp Pro Arg ThrGly Val 1370 1375 1380Leu Glu Phe Asn Gln Val Val Met Lys Tyr AlaPro His Leu Pro 1385 1390 1395Pro Ala Leu Arg Gly Val Ser Phe LysVal Lys Ser Gly Asp Lys 1400 1405 1410Val Gly Val Val Gly Arg ThrGly Ser Gly Lys Ser Thr Leu Leu 1415 1420 1425Leu Ala Leu Tyr ArgMet Phe Asn Leu Glu Ser Gly Ala Ile Thr 1430 1435 1440Leu Asp GlyIle Asp Ile Ser Thr Leu Thr Leu Glu Gln Leu Arg 1445 1450 1455ArgGly Leu Ser Val Ile Pro Gln Glu Pro Thr Val Phe Ser Gly 1460 14651470Thr Val Arg Thr Asn Leu Asp Pro Phe Gly Glu Phe Gly Ala Asp1475 1480 1485Ala Ile Leu Trp Glu Ala Leu Arg Asp Cys Gly Leu GluGlu Gln 1490 1495 1500Val Lys Ala Cys Gly Gly Leu Asp Ala Lys LeuAsp Gly Thr Gly 1505 1510 1515Gly Asn Ala Trp Ser Ile Gly Gln GlnGln Leu Met Cys Leu Ala 1520 1525 1530Arg Ala Ala Leu Lys Lys ValPro Val Leu Cys Leu Asp Glu Ala 1535 1540 1545Thr Ala Ala Met AspPro His Thr Glu Ala His Val Leu Glu Ile 1550 1555 1560Ile Glu ArgIle Phe Ser Asp Arg Thr Met Leu Thr Ile Ala His 1565 1570 1575ArgLeu Asp Asn Val Ile Arg Ser Asp Leu Val Val Val Met Asp 1580 15851590Ala Gly Gln Val Cys Glu Met Gly Thr Pro Asp Glu 
Leu Leu Ala1595 1600 1605Asn Pro Gln Ser Ala Phe Ser Gln Leu Val Asp Lys ThrGly Ala 1610 1615 1620Ala Ser Ala Ala Ala Leu Arg Lys Met Ala AlaAsp Phe Leu Asp 1625 1630 1635Glu Arg Ala Arg Gly Gln Lys Leu GlyPhe Lys Pro Arg Pro Ser 1640 1645 1650Leu Glu Glu Ser His Ile CysVal Ala Pro Ser Pro Ser Leu Ile 1655 1660 1665Leu Ser Thr Leu LeuPhe Pro Pro Ala Phe Met Ala Asn Val Thr 1670 1675 1680Ala Leu LeuLeu Pro Lys Pro Val Leu Ser His Ala Pro Val Ser 1685 1690 1695SerGln Thr Val Asn Thr Tyr Ile Arg Leu Asn Ile Ile Gln Leu 1700 17051710Gln Cys Asn Val Leu His Pro Ala Thr Lys Glu Ala Thr Trp Ser1715 1720 1725Ser Arg Arg Ile Thr Phe Thr Ala His Leu Ser Ser SerGly Ser 1730 1735 1740Lys Pro Pro Pro Pro Leu Pro Pro Leu Thr GluLeu Pro Glu Gly 1745 1750 1755Arg Gly Leu Asp Trp Ser Ser Ala GlyTyr Arg Asp Gly Arg Glu 1760 1765 1770Ala Ile Pro Ser Pro Ser AlaLys Tyr Ser Ala Ala Asp Tyr Gly 1775 1780 1785Ala Ala Gly Asp GlyVal Thr Asp Asp Thr Gln Ala Leu Gln Val 1790 1795 1800Ala Val AlaAla Ala His Glu Asp Asp Glu Gly Gly Val Val Tyr 1805 1810 1815LeuGly Ala Gly Thr Phe Val Leu Thr Gln Pro Leu Ser Ile Ala 1820 18251830Gly Ser Asn Val Val Ile Arg Gly Ala Gly Glu Asp Ala Thr Thr1835 1840 1845Ile Phe Val Pro Leu Pro Leu Ser Asp Val Phe Pro GlyThr Trp 1850 1855 1860Ser Met Asp Ala Ser Gly Lys Val Thr Ser ProTrp Ile Thr Arg 1865 1870 1875Gly Gly Phe Leu Ala Phe Ser Gly ArgArg Thr Lys Ser Ser Asp 1880 1885 1890Ser Ser Thr Leu Leu Ala ThrVal Ala Gly Ser Val Glu Gln Gly 1895 1900 1905Ala Ser Val Ile ProVal Asp Ser Thr Ala Glu Phe Arg Leu Gly 1910 1915 1920Gln Trp ValArg Ile Ile Ile Asn Asp Ala Ser Thr Asp Ala Ser 1925 1930 1935AlaGly Gly Gly Thr Leu Glu Arg Gly Ser Ser Glu Val Gln Glu 1940 19451950Ser Glu Thr Met Ile Ala Glu Gly Ala Thr Gly Gly Gly Ala Gly1955 1960 1965Val Arg Ala Gln Trp Thr Gly Val Leu His Ala Phe GluPro Thr 1970 1975 1980Val Gln Cys Ser Gly Val Glu Gln Leu Thr IleArg Phe Asn His 1985 1990 1995Ser Met Met Ala Ala His Leu Ala GluArg Gly Tyr Asn Ala Ile 2000 2005 2010Glu 
Leu Glu Asp Val Val AspCys Trp Ile Arg Gln Val Thr Ile 2015 2020 2025Leu Asn Ala Asp AsnAla Ile Arg Leu Arg Gly Thr Asp His Ser 2030 2035 2040Thr Leu SerGly Gln Ala Cys Ser Gly Gly Gly Val Val Ala Val 2045 2050 2055ValPro Val Trp Cys Arg Arg Gly Leu Pro Ser Pro Ala Asp Val 2060 20652070Thr Val Gly Val Thr Glu Leu Arg Trp Glu Pro Asp Thr Arg Glu2075 2080 2085Val Asn Gly His His Ala Ile Thr Val Ser Lys Gly HisAla Asn 2090 2095 2100Leu Val Thr Arg Phe Arg Ile Thr Ala Pro PheTyr His Asp Ile 2105 2110 2115Ser Leu Glu Gly Gly Ala Leu Leu AsnVal Ile Ser Ser Gly Gly 2120 2125 2130Gly Ala Asn Leu Asn Leu AspLeu His Arg Ser Gly Pro Trp Gly 2135 2140 2145Asn Leu Phe Ser GlnLeu Gly Met Gly Leu Ala Ala Arg Pro Phe 2150 2155 2160Asp Ala GlyGly Arg Asp Gly Arg Gly Ala His Ala Gly Arg Gln 2165 2170 2175AsnThr Phe Trp Asn Leu Gln Pro Gly Asp Val Ala Ala Ala Ala 2180 21852190Pro Ala Leu Gln Pro Ser Ala Ala Ala Gly Asp Ala Arg Arg Leu2195 2200 2205Leu Val Asp Gly Asp Ser Leu Leu His Ala Gly Thr GlyGln Ala 2210 2215 2220Arg Leu Leu Arg Gln Leu Glu Ala Asp Asp SerAla Glu Pro Leu 2225 2230 2235Leu Leu Pro Ser Cys Glu Phe Gly ProLeu Leu Asn Phe Val Gly 2240 2245 2250Gly Phe Ala Gly Glu Leu CysLys Ser Ser Gly Trp Leu Val Ala 2255 2260 2265Gly Leu Pro Asp AspArg Pro Asp Leu His Ala Ser Gln Val Thr 2270 2275 2280Ala Arg LeuGln His Gly Ala Ala Asp Asn Lys Thr His Ala 2285 2290229581373PRTSynechococcus PCC7942MISC_FEATUREPCC 7942 81Met Asp PheLeu Ser Asn Phe Leu Met Asp Phe Val Lys Gln Leu Gln1 5 10 15Ser ProThr Leu Ser Phe Leu Ile Gly Gly Met Val Ile Ala Ala Cys 20 25 30GlySer Gln Leu Gln Ile Pro Glu Ser Ile Cys Lys Ile Ile Val Phe 35 4045Met Leu Leu Thr Lys Ile Gly Leu Thr Gly Gly Met Ala Ile Arg Asn50 55 60Ser Asn Leu Thr Glu Met Val Leu Pro Ala Leu Phe Ser Val AlaIle65 70 75 80Gly Ile Leu Ile Val Phe Ile Ala Arg Tyr Thr Leu AlaArg Met Pro 85 90 95Lys Val Lys Thr Val Asp Ala Ile Ala Thr Gly GlyLeu Phe Gly Ala 100 105 110Val Ser Gly Ser Thr Met Ala Ala Ala LeuThr Leu Leu Glu Glu Gln 115 120 
125Lys Ile Pro Tyr Glu Ala Trp AlaGly Ala Leu Tyr Pro Phe Met Asp 130 135 140Ile Pro Ala Leu Val ThrAla Ile Val Val Ala Asn Ile Tyr Leu Asn145 150 155 160Lys Lys LysArg Lys Glu Ala Ala Phe Ala Ser Ala Gln Gly Ala Tyr 165 170 175SerLys Gln Pro Val Ala Ala Gly Asp Tyr Ser Ser Ser Ser Asp Tyr 180 185190Pro Ser Ser Arg Arg Glu Tyr Ala Gln Gln Glu Ser Gly Asp His Arg195 200 205Val Lys Ile Trp Pro Ile Val Glu Glu Ser Leu Gln Gly ProAla Leu 210 215 220Ser Ala Met Leu Leu Gly Val Ala Leu Gly Leu PheAla Arg Pro Glu225 230 235 240Ser Val Tyr Glu Gly Phe Tyr Asp ProLeu Phe Arg Gly Leu Leu Ser 245 250 255Ile Leu Met Leu Val Met GlyMet Glu Ala Trp Ser Arg Ile Ser Glu 260 265 270Leu Arg Lys Val AlaGln Trp Tyr Val Val Tyr Ser Ile Val Ala Pro 275 280 285Leu Ala HisGly Phe Ile Ala Phe Gly Leu Gly Met Ile Ala His Tyr 290 295 300AlaThr Gly Phe Ser Met Gly Gly Val Val Val Leu Ala Val Ile Ala305 310315 320Ala Ser Ser Ser Asp Ile Ser Gly Pro Pro Thr Leu Arg Ala GlyIle 325 330 335Pro Ser Ala Asn Pro Ser Ala Tyr Ile Gly Ala Ser ThrAla Ile Gly 340 345 350Thr Pro Val Ala Ile Gly Ile Ala Ile Pro LeuPhe Leu Gly Leu Ala 355 360 365Gln Thr Ile Gly Gly37082374PRTSynechocystis PCC6803 82Met Asp Phe Leu Ser Asn Phe LeuThr Asp Phe Val Gly Gln Leu Gln1 5 10 15Ser Pro Thr Leu Ala Phe LeuIle Gly Gly Met Val Ile Ala Ala Leu 20 25 30Gly Thr Gln Leu Val IlePro Glu Ala Ile Ser Thr Ile Ile Val Phe 35 40 45Met Leu Leu Thr LysIle Gly Leu Thr Gly Gly Met Ala Ile Arg Asn 50 55 60Ser Asn Leu ThrGlu Met Leu Leu Pro Val Ala Phe Ser Val Ile Leu65 70 75 80Gly IleLeu Ile Val Phe Ile Ala Arg Phe Thr Leu Ala Lys Leu Pro 85 90 95AsnVal Arg Thr Val Asp Ala Leu Ala Thr Gly Gly Leu Phe Gly Ala 100 105110Val Ser Gly
Ser Thr Met Ala Ala Ala Leu Thr Thr Leu Glu Glu Ser 115 120 125LysIle Ser Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro Phe Met Asp 130 135140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala Asn Ile Tyr LeuAsn145 150 155 160Lys Arg Lys Arg Lys Ser Ala Ala Ala Ser Ile GluGlu Ser Phe Ser 165 170 175Lys Gln Pro Val Ala Ala Gly Asp Tyr GlyAsp Gln Thr Asp Tyr Pro 180 185 190Arg Thr Arg Gln Glu Tyr Leu SerGln Gln Glu Pro Glu Asp Asn Arg 195 200 205Val Lys Ile Trp Pro IleIle Glu Glu Ser Leu Gln Gly Pro Ala Leu 210 215 220Ser Ala Met LeuLeu Gly Leu Ala Leu Gly Ile Phe Thr Lys Pro Glu225 230 235 240SerVal Tyr Glu Gly Phe Tyr Asp Pro Leu Phe Arg Gly Leu Leu Ser 245 250255Ile Leu Met Leu Ile Met Gly Met Glu Ala Trp Ser Arg Ile Gly Glu260 265 270Leu Arg Lys Val Ala Gln Trp Tyr Val Val Tyr Ser Leu IleAla Pro 275 280 285Ile Val His Gly Phe Ile Ala Phe Gly Leu Gly MetIle Ala His Tyr 290 295 300Ala Thr Gly Phe Ser Leu Gly Gly Val ValVal Leu Ala Val Ile Ala305 310 315 320Ala Ser Ser Ser Asp Ile SerGly Pro Pro Thr Leu Arg Ala Gly Ile 325 330 335Pro Ser Ala Asn ProSer Ala Tyr Ile Gly Ser Ser Thr Ala Ile Gly 340 345 350Thr Pro IleAla Ile Gly Val Cys Ile Pro Leu Phe Ile Gly Leu Ala 355 360 365GlnThr Leu Gly Ala Gly 37083370PRTNostoc sp. 
PCC 7120 (Anabaena sp.PCC 7120) 83Met Asp Phe Phe Ser Leu Phe Leu Met Asp Phe Val Lys GlnLeu Gln1 5 10 15Ser Pro Thr Leu Gly Phe Leu Ile Gly Gly Met Val IleAla Ala Leu 20 25 30Gly Ser Glu Leu Ile Ile Pro Glu Ala Ile Cys GlnIle Ile Val Phe 35 40 45Met Leu Leu Thr Lys Ile Gly Leu Thr Gly GlyIle Ala Ile Arg Asn 50 55 60Ser Asn Leu Thr Glu Met Val Leu Pro AlaAla Ser Ala Val Ala Val65 70 75 80Gly Val Leu Val Val Phe Ile AlaArg Tyr Thr Leu Ala Lys Leu Pro 85 90 95Lys Val Asn Thr Val Asp AlaIle Ala Thr Gly Gly Leu Phe Gly Ala 100 105 110Val Ser Gly Ser ThrMet Ala Ala Ala Leu Thr Leu Leu Glu Glu Gln 115 120 125Lys Ile GlnTyr Glu Ala Trp Ala Ala Ala Leu Tyr Pro Phe Met Asp 130 135 140IlePro Ala Leu Val Thr Ala Ile Val Val Ala Asn Ile Tyr Leu Asn145 150155 160Lys Lys Lys Arg Ser Ala Ala Gly Glu Tyr Leu Ser Lys Gln SerVal 165 170 175Ala Ala Gly Glu Tyr Pro Asp Gln Gln Asp Tyr Pro SerSer Arg Gln 180 185 190Glu Tyr Leu Arg Lys Gln Gln Ser Ala Asp AsnArg Val Lys Ile Trp 195 200 205Pro Ile Val Lys Glu Ser Leu Gln GlyPro Ala Leu Ser Ala Met Leu 210 215 220Leu Gly Ile Ala Leu Gly LeuPhe Thr Gln Pro Glu Ser Val Tyr Lys225 230 235 240Ser Phe Tyr AspPro Leu Phe Arg Gly Leu Leu Ser Ile Leu Met Leu 245 250 255Val MetGly Met Glu Ala Trp Ser Arg Ile Gly Glu Leu Arg Lys Val 260 265270Ala Gln Trp Tyr Val Val Tyr Ser Val Val Ala Pro Leu Val His Gly275 280 285Phe Ile Ala Phe Gly Leu Gly Met Ile Ala His Tyr Ala ThrGly Phe 290 295 300Ser Leu Gly Gly Val Val Ile Leu Ala Val Ile AlaAla Ser Ser Ser305 310 315 320Asp Ile Ser Gly Pro Pro Thr Leu ArgAla Gly Ile Pro Ser Ala Asn 325 330 335Pro Ser Ala Tyr Ile Gly AlaSer Thr Ala Ile Gly Thr Pro Ile Ala 340 345 350Ile Gly Leu Ala IlePro Leu Phe Leu Gly Leu Ala Gln Ala Ile Gly 355 360 365Gly Arg37084377PRTCyanothece PCC 7425 84Met Asp Phe Trp Ser Tyr Phe LeuMet Asp Phe Val Lys Gln Leu Gln1 5 10 15Ser Pro Thr Leu Gly Phe LeuIle Gly Gly Met Val Ile Ala Ala Leu 20 25 30Gly Ser Gln Leu Val IlePro Glu Ala Ile Cys Gln Ile Ile Val Phe 35 40 45Met Leu Leu Thr 
LysIle Gly Leu Thr Gly Gly Met Ala Ile Arg Asn 50 55 60Ser Asn Leu ThrGlu Met Val Leu Pro Ala Ala Phe Ser Val Ile Ser65 70 75 80Gly IleLeu Ile Val Phe Ile Ala Arg Tyr Thr Leu Ala Lys Leu Pro 85 90 95LysVal Arg Thr Val Asp Ala Ile Ala Thr Gly Gly Leu Phe Gly Ala 100 105110Val Ser Gly Ser Thr Met Ala Ala Ala Leu Thr Leu Leu Glu Glu Glu115 120 125Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro PheMet Asp 130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Ile Ala AsnIle Tyr Leu Asn145 150 155 160Lys Lys Lys Arg Arg Ala Glu Ser GluAla Leu Ser Lys Gln Glu Tyr 165 170 175Leu Gly Lys Gln Ser Ile ValAla Gly Asp Tyr Pro Ala Gln Gln Asp 180 185 190Tyr Pro Ser Thr ArgGln Glu Tyr Leu Ser Lys Gln Gln Gly Pro Glu 195 200 205Asn Asn ArgVal Lys Ile Trp Pro Ile Val Gln Glu Ser Leu Gln Gly 210 215 220ProAla Leu Ser Ala Met Leu Leu Gly Val Ala Leu Gly Ile Leu Thr225 230235 240Lys Pro Glu Ser Val Tyr Glu Ser Phe Tyr Asp Pro Leu Phe ArgGly 245 250 255Leu Leu Ser Ile Leu Met Leu Val Met Gly Met Glu AlaTrp Ser Arg 260 265 270Ile Gly Glu Leu Arg Lys Val Ala Gln Trp TyrVal Val Tyr Ser Val 275 280 285Val Ala Pro Phe Val His Gly Leu IleAla Phe Gly Leu Gly Met Phe 290 295 300Ala His Tyr Thr Met Gly PheSer Met Gly Gly Val Val Val Leu Ala305 310 315 320Val Ile Ala SerSer Ser Ser Asp Ile Ser Gly Pro Pro Thr Leu Arg 325 330 335Ala GlyIle Pro Ser Ala Asn Pro Ser Ala Tyr Ile Gly Ala Ser Thr 340 345350Ala Ile Gly Thr Pro Ile Ala Ile Gly Leu Cys Ile Pro Phe Phe Ile355 360 365Gly Leu Ala Gln Thr Leu Gly Gly Gly 37037585373PRTMicrocystis aeruginosa 85Met Asp Phe Phe Ser Leu Phe ValMet Asp Phe Ile Gln Gln Leu Gln1 5 10 15Ser Pro Thr Leu Ala Phe LeuIle Gly Gly Met Ile Ile Ala Ala Leu 20 25 30Gly Ser Glu Leu Val IlePro Glu Ser Ile Cys Thr Ile Ile Val Phe 35 40 45Met Leu Leu Thr LysIle Gly Leu Thr Gly Gly Ile Ala Ile Arg Asn 50 55 60Ser Asn Leu ThrGlu Met Val Leu Pro Met Ile Phe Ala Val Ile Val65 70 75 80Gly IleIle Val Val Phe Val Ala Arg Tyr Thr Leu Ala Asn Leu Pro 85 90 95LysVal Lys Val Val Asp Ala Ile 
Ala Thr Gly Gly Leu Phe Gly Ala 100 105110Val Ser Gly Ser Thr Met Ala Ala Gly Leu Thr Val Leu Glu Glu Gln115 120 125Lys Ile Pro Tyr Glu Ala Trp Ala Gly Ala Leu Tyr Pro PheMet Asp 130 135 140Ile Pro Ala Leu Val Thr Ala Ile Val Val Ala AsnIle Tyr Leu Asn145 150 155 160Lys Lys Lys Gln Lys Glu Ala Ala TyrAsp Gln Glu Ser Phe Ser Lys 165 170 175Gln Pro Val Ala Ala Gly AsnTyr Ser Asp Gln Gln Asp Tyr Pro Ser 180 185 190Ser Arg Gln Glu TyrLeu Ser Gln Gln Gln Pro Ala Asp Asn Arg Val 195 200 205Lys Ile TrpPro Ile Ile Glu Glu Ser Leu Arg Gly Pro Ala Leu Ser 210 215 220AlaMet Leu Leu Gly Leu Ala Leu Gly Ile Phe Thr Gln Pro Glu Ser225 230235 240Val Tyr Lys Ser Phe Tyr Asp Pro Leu Phe Arg Gly Leu Leu SerVal 245 250 255Leu Met Leu Val Met Gly Met Glu Ala Trp Ser Arg ValGly Glu Leu 260 265 270Arg Lys Val Ala Gln Trp Tyr Val Val Tyr SerVal Ile Ala Pro Phe 275 280 285Val His Gly Leu Ile Ala Phe Gly LeuGly Met Ile Ala His Tyr Ala 290 295 300Thr Gly Phe Ser Trp Gly GlyVal Val Met Leu Ala Val Ile Ala Ser305 310 315 320Ser Ser Ser AspIle Ser Gly Pro Pro Thr Leu Arg Ala Gly Ile Pro 325 330 335Ser AlaAsn Pro Ser Ala Tyr Ile Gly Ala Ser Thr Ala Ile Gly Thr 340 345350Pro Val Ala Ile Gly Leu Cys Ile Pro Phe Phe Val Gly Leu Ala Gln355 360 365Ala Leu Ser Gly Gly 37086369PRTAnabaena variabilis 86MetAsp Phe Val Ser Leu Phe Val Lys Asp Phe Ile Ala Gln Leu Gln1 5 1015Ser Pro Thr Leu Ala Phe Leu Ile Gly Gly Met Ile Ile Ala Ala Leu20 25 30Gly Ser Glu Leu Val Ile Pro Glu Ser Ile Cys Thr Ile Ile ValPhe 35 40 45Met Leu Leu Thr Lys Ile Gly Leu Thr Gly Gly Ile Ala IleArg Asn 50 55 60Ser Asn Leu Thr Glu Met Val Leu Pro Met Ile Phe AlaVal Ile Thr65 70 75 80Gly Ile Thr Ile Val Phe Ile Ser Arg Tyr ThrLeu Ala Lys Leu Pro 85 90 95Lys Val Lys Val Val Asp Ala Ile Ala ThrGly Gly Leu Phe Gly Ala 100 105 110Val Ser Gly Ser Thr Met Ala AlaGly Leu Thr Val Leu Glu Glu Gln 115 120 125Lys Met Ala Tyr Glu AlaTrp Ala Gly Ala Leu Tyr Pro Phe Met Asp 130 135 140Ile Pro Ala LeuVal Thr Ala Ile Val Ile Ala Asn Ile Tyr Leu 
Asn145 150 155 160LysLys Lys Arg Lys Glu Ala Val Tyr Ser Thr Glu Gln Pro Val Ala 165 170175Ala Gly Asp Tyr Pro Asp Gln Lys Asp Tyr Pro Ser Ser Arg Gln Glu180 185 190Tyr Leu Ser Gln Gln Lys Gly Asp Glu Asp Asn Arg Val LysIle Trp 195 200 205Pro Ile Ile Glu Glu Ser Leu Arg Gly Pro Ala LeuSer Ala Met Leu 210 215 220Leu Gly Leu Ala Leu Gly Leu Phe Thr GlnPro Glu Ser Val Tyr Lys225 230 235 240Ser Phe Tyr Asp Pro Ala PheArg Gly Leu Leu Ser Ile Leu Met Leu 245 250 255Val Met Gly Met GluAla Trp Ser Arg Ile Gly Glu Leu Arg Lys Val 260 265 270Ala Gln TrpTyr Val Val Tyr Ser Val Val Ala Pro Phe Val His Gly 275 280 285LeuIle Ala Phe Gly Leu Gly Met Ile Ala His Tyr Thr Met Asn Phe 290 295300Ser Met Gly Gly Val Val Ile Leu Ala Val Ile Ala Ser Ser SerSer305 310 315 320Asp Ile Ser Gly Pro Pro Thr Leu Arg Ala Gly IlePro Ser Ala Asn 325 330 335Pro Ser Ala Tyr Ile Gly Ala Ser Thr AlaVal Gly Thr Pro Val Ala 340 345 350Ile Gly Leu Cys Ile Pro Phe PheLeu Gly Leu Ala Gln Ala Ile Gly 355 360 365Gly87366PRTCyanothece87Met Asp Phe Leu Ser Leu Phe Val Lys Asp Phe Ile Ile Gln Leu Gln15 10 15Ser Pro Thr Leu Ala Phe Leu Ile Gly Gly Met Val Ile Ala AlaLeu 20 25 30Gly Ser Glu Leu Val Ile Pro Glu Ser Ile Cys Thr Ile IleVal Phe 35 40 45Met Leu Leu Thr Lys Ile Gly Leu Thr Gly Gly Ile AlaIle Arg Asn 50 55 60Ser Asn Leu Thr Glu Met Val Leu Pro Met Ile CysAla Val Ile Val65 70 75 80Gly Ile Val Val Val Phe Ile Ala Arg TyrThr Leu Ala Lys Leu Pro 85 90 95Lys Val Asn Val Val Asp Ala Ile AlaThr Gly Gly Leu Phe Gly Ala 100 105 110Val Ser Gly Ser Thr Met AlaAla Gly Leu Thr Val Leu Glu Glu Gln 115 120 125Lys Ile Pro Tyr GluAla Trp Ala Gly Ala Leu Tyr Pro Phe Met Asp 130 135 140Ile Pro AlaLeu Val Thr Ala Ile Val Val Ala Asn Ile Tyr Leu Asn145 150 155160Lys Lys Lys Arg Lys Ala Thr Val Met Gln Glu Ser Leu Ser Lys Gln165 170 175Pro Val Ala Ala Gly Asp Tyr Pro Ser Ser Arg Gln Glu TyrVal Ser 180 185 190Gln Gln Gln Pro Glu Asp Asn Arg Val Lys Ile TrpPro Ile Ile Glu 195 200 205Glu Ser Leu Arg Gly Pro Ala Leu Ser AlaMet 
Leu Leu Gly Leu Ala 210 215 220Leu Gly Ile Leu Thr Gln Pro GluSer Val Tyr Lys Gly Phe Tyr Asp225 230 235 240Pro Pro Phe Arg GlyLeu Leu Ser Ile Leu Met Leu Val Met Gly Met 245 250 255Glu Ala TrpSer Arg Ile Gly Glu Leu Arg Lys Val Ala Gln Trp Tyr 260 265 270ValVal Tyr Ser Val Ala Ala Pro Phe Ile His Gly Leu Leu Ala Phe 275 280285Gly Leu Gly Met Ile Ala His Tyr Thr Met Gly Phe Ser Met Gly Gly290 295 300Val Val Ile Leu Ala Val Ile Ala Ser Ser Ser Ser Asp IleSer Gly305 310 315 320Pro Pro Thr Leu Arg Ala Gly Ile Pro Ser AlaAsn Pro Ser Ala Tyr 325 330 335Ile Gly Ala Ser Thr Ala Ile Gly ThrPro Val Ala Ile Gly Leu Cys 340 345 350Ile Pro Phe Phe Val Gly LeuAla Gln Ala Ile Gly Gly Phe 355 360 36588378PRTArthrospiraplatensis str. Paraca 88Met Asp Phe Leu Ser Gly Phe Leu Thr Arg PheLeu Ala Gln Leu Gln1 5 10 15Ser Pro Thr Leu Gly Phe Leu Ile Gly GlyMet Val Ile Ala Ala Val 20 25 30Asn Ser Gln Leu Gln Ile Pro Asp AlaIle Tyr Lys Phe Val Val Phe 35 40 45Met Leu Leu Met Lys Val Gly LeuSer Gly Gly Ile Ala Ile Arg Gly 50 55 60Ser Asn Leu Thr Glu Met LeuLeu Pro Ala Val Phe Ala Leu Val Thr65 70 75 80Gly Ile Val Ile ValPhe Ile Gly Arg Tyr Thr Leu Ala Lys Leu Pro 85 90 95Asn Val Lys ThrVal Asp Ala Ile Ala Thr Ala Gly Leu Phe Gly Ala 100 105 110Val SerGly Ser Thr Met Ala Ala Ala Leu Thr Leu Leu Glu Glu Gln 115 120125Gly Met Glu Tyr Glu Ala Trp Ala Ala Ala Leu Tyr Pro Phe Met Asp130 135 140Ile Pro Ala Leu Val Ser Ala Ile Val Leu Ala Ser Ile TyrVal Ser145 150 155 160Lys Gln Lys His Ser Asp Met Ala Asp Glu SerLeu Ser Lys His Glu 165 170 175Ser Leu Ser Lys Gln Pro Val Ala AlaGly Asp Tyr Pro Ser Lys Pro 180 185 190Glu Tyr Pro Thr Thr Arg GlnGlu Tyr Leu Ser Gln Gln Arg Gly Ser 195 200 205Ala Asn Gln Gly ValGlu Ile Trp Pro Ile Ile Lys Glu Ser Leu Gln 210 215 220Gly Ser AlaLeu Ser Ala Leu Leu Leu Gly Leu Ala Leu Gly Leu Leu225 230 235240Thr Arg Pro Glu Ser Val Phe Gln Ser Phe Tyr Glu Pro Leu Phe Arg245 250 255Gly Leu Leu Ser Ile Leu Met Leu Val Met Gly Met Glu AlaThr Ala 260 265 270Arg Leu Gly Glu 
Leu Arg Lys Val Ala Gln Trp Tyr Ala Val Tyr Ala 275 280 285 Phe Ile Ala Pro Leu Leu His Gly Leu Ile Ala Phe Gly Leu Gly Met 290 295 300 Ile Ala His Val Val Thr Gly Phe Ser Leu Gly Gly Val Val Ile Leu 305 310 315 320 Ala Val Ile Ala Ser Ser Ser Ser Asp Ile Ser Gly Pro Pro Thr Leu 325
330 335Arg Ala Gly Ile Pro Ser Ala Asn Pro Ser Ala Tyr Ile Gly SerSer 340 345 350Thr Ala Val Gly Thr Pro Val Ala Ile Ala Leu Gly IlePro Leu Tyr 355 360 365Ile Gly Leu Ala Gln Ala Leu Met Gly Gly 37037589336PRTChlamydomonas reinhardtii 89Met Gln Thr Thr Met Thr ArgPro Cys Leu Ala Gln Pro Val Leu Arg1 5 10 15Ser Arg Val Leu Arg SerPro Met Arg Val Val Ala Ala Ser Ala Pro 20 25 30Thr Ala Val Thr ThrVal Val Thr Ser Asn Gly Asn Gly Asn Gly His 35 40 45Phe Gln Ala AlaThr Thr Pro Val Pro Pro Thr Pro Ala Pro Val Ala 50 55 60Val Ser AlaPro Val Arg Ala Val Ser Val Leu Thr Pro Pro Gln Val65 70 75 80TyrGlu Asn Ala Ile Asn Val Gly Ala Tyr Lys Ala Gly Leu Thr Pro 85 9095Leu Ala Thr Phe Val Gln Gly Ile Gln Ala Gly Ala Tyr Ile Ala Phe100 105 110Gly Ala Phe Leu Ala Ile Ser Val Gly Gly Asn Ile Pro GlyVal Ala 115 120 125Ala Ala Asn Pro Gly Leu Ala Lys Leu Leu Phe AlaLeu Val Phe Pro 130 135 140Val Gly Leu Ser Met Val Thr Asn Cys GlyAla Glu Leu Phe Thr Gly145 150 155 160Asn Thr Met Met Leu Thr CysAla Leu Ile Glu Lys Lys Ala Thr Trp 165 170 175Gly Gln Leu Leu LysAsn Trp Ser Val Ser Tyr Phe Gly Asn Phe Val 180 185 190Gly Ser IleAla Met Val Ala Ala Val Val Ala Thr Gly Cys Leu Thr 195 200 205ThrAsn Thr Leu Pro Val Gln Met Ala Thr Leu Lys Ala Asn Leu Gly 210 215220Phe Thr Glu Val Leu Ser Arg Ser Ile Leu Cys Asn Trp Leu ValCys225 230 235 240Cys Ala Val Trp Ser Ala Ser Ala Ala Thr Ser LeuPro Gly Arg Ile 245 250 255Leu Ala Leu Trp Pro Cys Ile Thr Ala PheVal Ala Ile Gly Leu Glu 260 265 270His Ser Val Ala Asn Met Phe ValIle Pro Leu Gly Met Met Leu Gly 275 280 285Ala Glu Val Thr Trp SerGln Phe Phe Phe Asn Asn Leu Ile Pro Val 290 295 300Thr Leu Gly AsnThr Ile Ala Gly Val Leu Met Met Ala Ile Ala Tyr305 310 315 320SerIle Ser Phe Gly Ser Leu Gly Lys Ser Ala Lys Pro Ala Thr Ala 325 33033590337PRTVolvox carteriMISC_FEATUREf. 
nagariensis 90Met Gln ThrThr Met Ser Val Thr Arg Pro Cys Val Gly Leu Arg Pro1 5 10 15Leu ProVal Arg Asn Val Arg Ser Leu Ile Arg Ala Gln Ala Ala Pro 20 25 30GlnGln Val Ser Thr Ala Val Ser Thr Asn Gly Asn Gly Asn Gly Val 35 4045Ala Ala Ala Ser Leu Ser Val Pro Ala Pro Val Ala Ala Pro Ala Gln50 55 60Ala Val Ser Thr Pro Val Arg Ala Val Ser Val Leu Thr Pro ProGln65 70 75 80Val Tyr Glu Asn Ala Ala Asn Val Gly Ala Tyr Lys AlaSer Leu Gly 85 90 95Val Leu Ala Thr Phe Val Gln Gly Ile Gln Ala GlyAla Tyr Ile Ala 100 105 110Phe Gly Ala Phe Leu Ala Cys Ser Val GlyGly Asn Ile Pro Gly Ile 115 120 125Thr Ala Ser Asn Pro Gly Leu AlaLys Leu Leu Phe Ala Leu Val Phe 130 135 140Pro Val Gly Leu Ser MetVal Thr Asn Cys Gly Ala Glu Leu Tyr Thr145 150 155 160Gly Asn ThrMet Met Leu Thr Cys Ala Ile Phe Glu Lys Lys Ala Thr 165 170 175TrpAla Gln Leu Val Lys Asn Trp Val Val Ser Tyr Ala Gly Asn Phe 180 185190Val Gly Ser Ile Ala Met Val Ala Ala Val Val Ala Thr Gly Leu Met195 200 205Ala Ser Asn Gln Leu Pro Val Asn Met Ala Thr Ala Lys SerSer Leu 210 215 220Gly Phe Thr Glu Val Leu Ser Arg Ser Ile Leu CysAsn Trp Leu Val225 230 235 240Cys Cys Ala Val Trp Ser Ala Ser AlaAla Thr Ser Leu Pro Gly Arg 245 250 255Ile Leu Gly Leu Trp Pro ProIle Thr Ala Phe Val Ala Ile Gly Leu 260 265 270Glu His Ser Val AlaAsn Met Phe Val Ile Pro Leu Gly Met Met Leu 275 280 285Gly Ala AspVal Thr Trp Ser Gln Phe Phe Phe Asn Asn Leu Val Pro 290 295 300ValThr Leu Gly Asn Thr Ile Ala Gly Val Val Met Met Ala Val Ala305 310315 320Tyr Ser Val Ser Tyr Gly Ser Leu Gly Lys Thr Pro Lys Pro AlaThr 325 330 335Ala913249DNAEngineered construct (codon optimizedgene) 91atgctgcccg gcctgggcgt catcctgctg gtgctgccca tgcagtactacttcggctac 60aagatcgtgc agatcaagct gcagaacgcc aagcacgtcg ccctgcgctccgccatcatg 120caggaggtgc tgcccgccat caagctggtc aagtactacgcctgggagca gttctttgag 180aaccagatca gcaaggtccg ccgcgaggagatccgcctca acttctggaa ctgcgtgatg 240aaggtcatca acgtggcctgcgtgttctgc gtgccgccca tgaccgcctt cgtcatcttc 300accacctacgagttccagcg cgcccgcctg gtgtccagcg 
tcgccttcac caccctgtcg360ctgttcaaca ttctgcgctt ccccctggtc gtgctgccca aggccctgcgtgccgtgtcc 420gaggccaacg cgtctctcca gcgcctggag gcctacctgctggaggaggt gccctcgggc 480actgccgccg tcaagacccc caagaacgctccccccggcg ccgtcatcga gaacggtgtg 540ttccaccacc cctccaaccccaactggcac ctgcacgtgc ccaagttcga ggtcaagccc 600ggccaggtcgttgctgtggt gggccgcatc gccgccggca agtcgtccct ggtgcaggcc660atcctcggca acatggtcaa ggagcacggc agcttcaacg tgggcggccgcatctcctac 720gtgccgcaga acccctggct gcagaacctg tccctgcgtgacaacgtgct gtttggcgag 780cagttcgatg agaacaagta caccgacgtcatcgagtcct gcgccctgac cctggacctg 840cagatcctgt ccaacggtgaccagtccaag gccggcatcc gcggtgtcaa cttctccggt 900ggccagcgccagcgcgtgaa cctggcccgc tgcgcctacg ccgacgccga cctggtgctg960ctcgacaacg ccctgtccgc cgtggaccac cacaccgccc accacatcttcgacaagtgc 1020atcaagggcc tgttctccga caaggccgtg gtgctggtcacccaccagat cgagttcatg 1080ccccgctgcg acaacgtggc catcatggacgagggccgct gcctgtactt cggcaagtgg 1140aacgaggagg cccagcacctgctcggcaag ctgctgccca tcacccacct gctgcacgcc 1200gccggctcccaggaggctcc ccccgccccc aagaagaagg ccgaggacaa ggccggcccc1260cagaagtcgc agtcgctgca gctgaccctg gcccccacct ccatcggcaagcccaccgag 1320aagcccaagg acgtccagaa gctgactgcc taccaggccgccctcatcta cacctggtac 1380ggcaacctgt tcctggttgg cgtgtgcttcttcttcttcc tggcggctca gtgctctcgc 1440cagatctccg atttctgggtgcgctggtgg gtgaacgacg agtacaagaa gttccccgtg 1500aagggcgagcaggactcggc cgccaccacc ttctactgcc tcatctacct gctgctggtg1560ggcctgttct acatcttcat gatcttccgc ggcgccactt tcctgtggtgggtgctcaag 1620tcctcggaga ccatccgcag gaaggccctg cacaacgtcctcaacgcgcc catgggcttc 1680ttcctggtca cgccggtcgg cgacctgctgctcaacttca ccaaggacca ggacattatg 1740gatgagaacc tgcccgatgccgttcacttc atgggcatct acggcctgat tctgctggcg 1800accaccatcaccgtgtccgt caccatcaac ttcttcgccg ccttcaccgg cgcgctgatc1860atcatgaccc tcatcatgct ctccatctac ctgcccgccg ccactgccctgaagaaggcg 1920cgcgccgtgt ctggcggcat gctggtcggc ctggttgccgaggttctgga gggccttggc 1980gtggttcagg ccttcaacaa gcaggagtacttcattgagg aggccgcccg ccgcaccaac 2040atcaccaact ccgccgtcttcaacgccgag gcgctgaacc tgtggctggc 
tttctggtgc 2100gacttcatcggcgcctgcct ggtgggcgtg gtgtccgcct tcgccgtggg catggccaag2160gacctgggcg gcgcgaccgt cggcctggcc ttctccaaca tcattcagatgcttgtgttc 2220tacacctggg tggtccgctt catctccgag tccatctccctcttcaactc cgtcgagggc 2280atggcctacc tcgccgacta cgtgccccacgatggtgtct tctatgacca gcgccagaag 2340gacggcgtcg ccaagcaaatcgtcctgccc gacggcaaca tcgtgcccgc cgcctccaag 2400gtccaggtcgtggttgacga cgccgccctc gcccgctggc ctgccaccgg caacatccgc2460ttcgaggacg tgtggatgca gtaccgcctg gacgctcctt gggctctgaagggcgtcacc 2520ttcaagatca acgacggcga gaaggtcggc gccgtgggccgcaccggctc cggcaagtcc 2580accacgctgc tggcgctgta ccgcatgttcgagctgggca agggccgcat cctggtcgac 2640ggcgtggaca tcgccaccctgtcgctcaag cgcctgcgca ccggcctgtc catcattccc 2700caggagcccgtcatgttcac cggcaccgtg cgctccaacc tggacccctt cggcgagttc2760aaggacgatg ccattctgtg ggaggtgctg aagaaggtcg gcctcgaggaccaggcgcag 2820cacgccggcg gcctggacgg ccaggtcgat ggcaccggcggcaaggcctg gtctctgggc 2880cagatgcagc tggtgtgcct ggctcgcgccgccctgcgcg ccgtgcccat cctgtgcctg 2940gacgaggcta ccgccgccatggacccgcac actgaggcca tcgtgcagca gaccatcaag 3000aaggtgttcgacgaccgcac caccatcacc attgcccacc gcctggacac catcatcgag3060tccgacaaga tcatcgtgat ggagcagggc tcgctgatgg agtacgagtcgccctcgaag 3120ctgctcgcca accgcgactc catgttctcc aagctggtcgacaagaccgg ccccgccgcc 3180gccgctgcgc tgcgcaagat ggccgaggacttctggtcca ctcgctccgc gcagggccgc 3240aaccagtaa3249921008DNAEngineered construct (codon optimized gene)92atgcagacca ctatgactcg cccttgcctt gcccagcccg tgctgcgatc tcgtgtgctc60cggtcgccta tgcgggtggt tgcagcgagc gctcctaccg cggtgacgac agtcgtgacc120tcgaatggaa atggcaacgg tcatttccaa gctgctacta cgcccgtgccccctactccc 180gctcccgtcg ctgtttccgc gcctgtgcgc gctgtgtcggtgctgactcc tcctcaagtg 240tatgagaacg ccattaatgt tggcgcctacaaggccgggc taacgcctct ggcaacgttt 300gtccagggca tccaagccggtgcctacatt gcgttcggcg ccttcctcgc catctccgtg 360ggaggcaacatccccggcgt cgccgccgcc aaccccggcc tggccaagct gctatttgct420ctggtgttcc ccgtgggtct gtccatggtg accaactgcg gcgccgagctgttcacgggc 480aacaccatga tgctcacatg cgcgctcatc gagaagaaggccacttgggg gcagcttctg 
540aagaactgga gcgtgtccta cttcggcaacttcgtgggct ccatcgccat ggtcgccgcc 600gtggtggcca ccggctgcctgaccaccaac accctgcctg tgcagatggc caccctcaag 660gccaacctgggcttcaccga ggtgctgtcg cgctccatcc tgtgcaactg gctggtgtgc720tgcgccgtgt ggtccgcctc cgccgccacc tcgctgcccg gccgcatcctggcgctgtgg 780ccctgcatca ccgccttcgt ggccatcggc ctggagcactccgtcgccaa catgttcgtg 840attcctctgg gcatgatgct gggcgctgaggtcacgtgga gccagttctt tttcaacaac 900ctgatccccg tcaccctgggcaacaccatt gctggcgttc tcatgatggc catcgcctac 960tccatctcgttcggctccct cggcaagtcc gccaagcccg ccaccgcg 100893148PRTArabidopsisthalianaMISC_FEATUREFerredoxin1 93Met Ala Ser Thr Ala Leu Ser SerAla Ile Val Ser Thr Ser Phe Leu1 5 10 15Arg Arg Gln Gln Thr Pro IleSer Leu Arg Ser Leu Pro Phe Ala Asn 20 25 30Thr Gln Ser Leu Phe GlyLeu Lys Ser Ser Thr Ala Arg Gly Gly Arg 35 40 45Val Thr Ala Met AlaThr Tyr Lys Val Lys Phe Ile Thr Pro Glu Gly 50 55 60Glu Gln Glu ValGlu Cys Glu Glu Asp Val Tyr Val Leu Asp Ala Ala65 70 75 80Glu GluAla Gly Leu Asp Leu Pro Tyr Ser Cys Arg Ala Gly Ser Cys 85 90 95SerSer Cys Ala Gly Lys Val Val Ser Gly Ser Ile Asp Gln Ser Asp 100 105110Gln Ser Phe Leu Asp Asp Glu Gln Met Ser Glu Gly Tyr Val Leu Thr115 120 125Cys Val Ala Tyr Pro Thr Ser Asp Val Val Ile Glu Thr HisLys Glu 130 135 140Glu Ala Ile Met14594783DNAHomo sapiens94atgtcccatc actgggggta cggcaaacac aacggacctg agcactggca taaggacttc60cccattgcca agggagagcg ccagtcccct gttgacatcg acactcatac agccaagtat120gacccttccc tgaagcccct gtctgtttcc tatgatcaag caacttccctgaggatcctc 180aacaatggtc atgctttcaa cgtggagttt gatgactctcaggacaaagc agtgctcaag 240ggaggacccc tggatggcac ttacagattgattcagtttc actttcactg gggttcactt 300gatggacaag gttcagagcatactgtggat aaaaagaaat atgctgcaga acttcacttg 360gttcactggaacaccaaata tggggatttt gggaaagctg tgcagcaacc tgatggactg420gccgttctag gtattttttt gaaggttggc agcgctaaac cgggccttcagaaagttgtt 480gatgtgctgg attccattaa aacaaagggc aagagtgctgacttcactaa cttcgatcct 540cgtggcctcc ttcctgaatc cttggattactggacctacc caggctcact gaccacccct 600cctcttctgg aatgtgtgacctggattgtg ctcaaggaac 
ccatcagcgt cagcagcgag 660caggtgttgaaattccgtaa acttaacttc aatggggagg gtgaacccga agaactgatg720gtggacaact ggcgcccagc tcagccactg aagaacaggc aaatcaaagcttccttcaaa 780taa 78395148PRTArabidopsisthalianaMISC_FEATUREFerredoxin2(thale cress) 95Met Ala Ser Thr AlaLeu Ser Ser Ala Ile Val Gly Thr Ser Phe Ile1 5 10 15Arg Arg Ser ProAla Pro Ile Ser Leu Arg Ser Leu Pro Ser Ala Asn 20 25 30Thr Gln SerLeu Phe Gly Leu Lys Ser Gly Thr Ala Arg Gly Gly Arg 35 40 45Val ThrAla Met Ala Thr Tyr Lys Val Lys Phe Ile Thr Pro Glu Gly 50 55 60GluLeu Glu Val Glu Cys Asp Asp Asp Val Tyr Val Leu Asp Ala Ala65 70 7580Glu Glu Ala Gly Ile Asp Leu Pro Tyr Ser Cys Arg Ala Gly Ser Cys85 90 95Ser Ser Cys Ala Gly Lys Val Val Ser Gly Ser Val Asp Gln SerAsp 100 105 110Gln Ser Phe Leu Asp Asp Glu Gln Ile Gly Glu Gly PheVal Leu Thr 115 120 125Cys Ala Ala Tyr Pro Thr Ser Asp Val Thr IleGlu Thr His Lys Glu 130 135 140Glu Asp IleVal14596253PRTArabidopsis thaliana (thalecress)MISC_FEATUREferredoxin-NADP(+)oxidoreductase(FNR1) 96Phe ThrThr Glu Gly Glu Val Pro Tyr Arg Glu Gly Gln Ser Ile Gly1 5 10 15ValIle Pro Glu Gly Ile Asp Lys Asn Gly Lys Pro His Lys Leu Arg 20 2530Leu Tyr Ser Ile Ala Ser Ser Ala Ile Gly Asp Phe Gly Asp Ser Lys35 40 45Thr Val Ser Leu Cys Val Lys Arg Leu Val Tyr Thr Asn Asp GlyGly 50 55 60Glu Ile Val Lys Gly Val Cys Ser Asn Phe Leu Cys Asp LeuLys Pro65 70 75 80Gly Asp Glu Ala Lys Ile Thr Gly Pro Val Gly LysGlu Met Leu Met 85 90 95Pro Lys Asp Pro Asn Ala Thr Ile Ile Met LeuGly Thr Gly Thr Gly 100 105 110Ile Ala Pro Phe Arg Ser Phe Leu TrpLys Met Phe Phe Glu Glu His 115 120 125Glu Asp Tyr Lys Phe Asn GlyLeu Ala Trp Leu Phe Leu Gly Val Pro 130 135 140Thr Ser Ser Ser LeuLeu Tyr Lys Glu Glu Phe Glu Lys Met Lys Glu145 150 155 160Lys AsnPro Asp Asn Phe Arg Leu Asp Phe Ala Val Ser Arg Glu Gln 165 170175Thr Asn Glu Lys Gly Glu Lys Met Tyr Ile Gln Thr Arg Met Ala Glu180 185 190Tyr Ala Glu Glu Leu Trp Glu Leu Leu Lys Lys Asp Asn ThrPhe Val 195 200 205Tyr Met Cys Gly Leu Lys Gly Met Glu Lys Gly IleAsp Asp Ile 
Met 210 215 220Val Ser Leu Ala Ala Lys Asp Gly Ile AspTrp Leu Glu Tyr Lys Lys225 230 235 240Gln Leu Lys Arg Ser Glu GlnTrp Asn Val Glu Val Tyr 245 25097294PRTArabidopsis thaliana (thalecress)MISC_FEATUREferredoxin-NADP(+)oxidoreductase(FNR2) 97Met AlaThr Thr Met Asn Ala Ala Val Ser Leu Thr Ser Ser Asn Ser1 5 10 15SerSer Phe Pro Ala Thr Ser Cys Ala Ile Ala Pro Glu Arg Ile Arg 20 2530Phe Thr Lys Gly Ala Phe Tyr Tyr Lys Ser Asn Asn Val Val Thr Gly35 40 45Lys Arg Val Phe Ser Ile Lys Ala Gln Ile Thr Thr Glu Thr AspThr 50 55 60Pro Thr Pro Ala Lys Lys Val Glu Lys Val Ser Lys Lys AsnGlu Glu65 70 75 80Gly Val Ile Val Asn Arg Tyr Arg Pro Lys Glu ProTyr Thr Gly Lys 85 90 95Cys Leu Leu Asn Thr Lys Ile Thr Ala Asp AspAla Pro Gly Glu Thr 100 105 110Trp His Met Val Phe Ser His Gln GlyGlu Ile Pro Tyr Arg Glu Gly 115 120 125Gln Ser Val Gly Val Ile AlaAsp Gly Ile Asp Lys Asn Gly Lys Pro 130 135 140His Lys Val Arg LeuTyr Ser Ile Ala Ser Ser Ala Leu Gly Asp Leu145 150 155 160Gly AsnSer Glu Thr Val Ser Leu Cys Val Lys Arg Leu Val Tyr Thr 165 170175Asn Asp Gln Gly Glu Thr Val Lys Gly Val Cys Ser Asn Phe Leu Cys180 185 190Asp Leu Ala Pro Gly Ser Asp Val Lys Leu Thr Gly Pro ValGly Lys 195 200 205Glu Met Leu Met Pro Lys Asp Pro Asn Ala Thr ValIle Met Leu Ala 210 215 220Thr Gly Thr Gly Ile Ala Pro Phe Arg SerPhe Leu Trp Lys Met Phe225 230 235 240Phe Glu Lys His Asp Asp TyrLys Phe Asn Gly Leu Ala Trp Leu Phe 245 250 255Leu Gly Val Pro ThrThr Ser Ser Leu Leu Tyr Gln Glu Glu Phe Asp 260 265 270Lys Met LysAla Lys Ala Pro Glu Asn Phe Arg Val Asp Tyr Ala Ile 275 280 285SerArg
Glu Gln Ala Asn 29098249PRTProteobacteria 98Met Lys Leu Leu Leu IleLeu Gly Ser Val Ile Ala Leu Pro Thr Phe1 5 10 15Ala Ala Gly Gly GlyAsp Leu Asp Ala Ser Asp Tyr Thr Gly Val Ser 20 25 30Phe Trp Leu ValThr Ala Ala Leu Leu Ala Ser Thr Val Phe Phe Phe 35 40 45Val Glu ArgAsp Arg Val Ser Ala Lys Trp Lys Thr Ser Leu Thr Val 50 55 60Ser GlyLeu Val Thr Gly Ile Ala Phe Trp His Tyr Met Tyr Met Arg65 70 7580Gly Val Trp Ile Glu Thr Gly Asp Ser Pro Thr Val Phe Arg Tyr Ile85 90 95Asp Trp Leu Leu Thr Val Pro Leu Leu Ile Cys Glu Phe Tyr LeuIle 100 105 110Leu Ala Ala Ala Thr Asn Val Ala Gly Ser Leu Phe LysLys Leu Leu 115 120 125Val Gly Ser Leu Val Met Leu Val Phe Gly TyrMet Gly Glu Ala Gly 130 135 140Ile Met Ala Ala Trp Pro Ala Phe IleIle Gly Cys Leu Ala Trp Val145 150 155 160Tyr Met Ile Tyr Glu LeuTrp Ala Gly Glu Gly Lys Ser Ala Cys Asn 165 170 175Thr Ala Ser ProAla Val Gln Ser Ala Tyr Asn Thr Met Met Tyr Ile 180 185 190Ile IlePhe Gly Trp Ala Ile Tyr Pro Val Gly Tyr Phe Thr Gly Tyr 195 200205Leu Met Gly Asp Gly Gly Ser Ala Leu Asn Leu Asn Leu Ile Tyr Asn210 215 220Leu Ala Asp Phe Val Asn Lys Ile Leu Phe Gly Leu Ile IleTrp Asn225 230 235 240Val Ala Val Lys Glu Ser Ser Asn Ala24599446DNAArabidopsis thaliana (thale cress) 99atttcgaaagagaatctcag aaagatcaat ctagagagac ccgttcgtct cctttcctta 60agccattacctctgaaacca tccaaggctt tggttgcaac tggaggcaga gcacagaggc120ttcaagttaa ggccctcaag atggacaagg ctttgaccgg tatctccgcggctgctctta 180ctgcttcgat ggtgattccg gagatagctg aagctgctggttctggaatc tctccttccc 240tcaagaattt cttgctcagc attgcttctggtggcctcgt cctcactgtc atcattggtg 300tcgtcgtcgg cgtctccaactttgaccctg tcaagagaac ctaagaccta tatatctttc 360ttacatcattattgtaatct gttctccttc tgtgtattcg tttcaatgtt gcagcaatga420acttttggat aaaaaaaaaa aaaaaa 446100642DNAMus musculus100aaggcagaag caccggtcag ctgggggaag ggacacagag gaagagacggagtgtacagg 60gaccaaggtt gtatgtcaag gagcaaagag caggaagaca ggaggctttgagcacacacg 120gctttgtcta ttccagtaac aacccccttg ctgccgctcaccggttccat ggagataata 180tttggccaga ataagaaaga acagctggagccagttcagg 
ccaaagtgac aggcagcatt 240ccagcatggc tgcaggggaccctgctccga aacgggcccg ggatgcacac agtgggagag 300agcaagtacaaccattggtt tgatggcctg gcccttctcc acagtttctc catcagagat360ggggaggtct tctacaggag caaatacctg cagagtgaca cctacatcgccaacattgag 420gccaacagaa tcgtggtgtc tgagttcgga accatggcctacccggaccc ctgcaaaaac 480atcttttcca aagctttctc ctacttgtctcacaccatcc ccgacttcac agacaactgt 540ctgatcaaca tcatgaaatgtggagaagac ttctatgcaa ccacggagac caactacatc 600aggaaaatcgacccccagac cctagagacc ttggagaagg tg 642101217PRTArabidopsisthalianaMISC_FEATURE(thale cress) 101Met Ala Ser Leu Ser Thr IleThr Gln Pro Ser Leu Val His Ile Pro1 5 10 15Gly Glu Ser Val Leu HisHis Val Pro Ser Thr Cys Ser Phe Pro Trp 20 25 30Lys Pro Thr Ile AsnThr Lys Arg Ile Ile Cys Ser Pro Ala Arg Asn 35 40 45Ser Ser Glu ValSer Ala Glu Ala Glu Thr Glu Gly Gly Ser Ser Thr 50 55 60Ala Val AspGlu Ala Pro Lys Glu Ser Pro Ser Leu Ile Ser Ala Leu65 70 75 80AsnVal Glu Arg Ala Leu Arg Gly Leu Pro Ile Thr Asp Val Asp His 85 9095Tyr Gly Arg Leu Gly Ile Phe Arg Asn Cys Ser Tyr Asp Gln Val Thr100 105 110Ile Gly Tyr Lys Glu Arg Val Lys Glu Leu Lys Glu Gln GlyLeu Asp 115 120 125Glu Glu Gln Leu Lys Thr Lys Met Asp Leu Ile LysSer Tyr Thr Ile 130 135 140Leu Ser Thr Val Glu Glu Arg Arg Met TyrAsp Trp Ser Leu Ala Arg145 150 155 160Ser Glu Lys Ala Glu Arg TyrVal Trp Pro Phe Glu Val Asp Ile Met 165 170 175Glu Pro Ser Arg GluGlu Pro Pro Pro Gln Glu Pro Glu Asp Val Gly 180 185 190Pro Thr ArgIle Leu Gly Tyr Phe Ile Gly Ala Trp Leu Val Leu Gly 195 200 205ValAla Leu Ser Val Ala Phe Asn Arg 210 21510253PRTCyanophora paradoxa102Met Asn Ala Phe Val Ala Ser Val Ala Pro Ile Ala Val Ala Gly Ser15 10 15Ala Thr Leu Ser Ser Ala Val Cys Ala Gln Lys Lys Ala Phe PheGly 20 25 30Ala Gln Val Ala Ala Lys Lys Thr Thr Phe Glu Ala Ala ProAla Arg 35 40 45Phe Ile Val Arg Ala 5010345PRTArabidopsis thaliana103Met Ala Thr Gln Ala Ala Gly Ile Phe Asn Ser Ala Ile Thr Thr Ala15 10 15Ala Thr Ser Gly Val Lys Lys Leu His Phe Phe Ser Thr Thr HisArg 20 25 30Pro Lys Ser Leu Ser Phe Thr Lys 
Thr Ala Ile Arg Ala 3540 4510430PRTArabidopsis thalianaMISC_FEATURE(thalecress)MISC_FEATURECAB transit peptide(thale cress) 104Met Gln SerSer Ala Val Phe Ser Leu Ser Pro Ser Leu Pro Leu Leu1 5 10 15Lys ProArg Arg Leu Ser Leu Arg His His Pro Ile Thr Thr 20 253010544PRTArabidopsis thalianaMISC_FEATURE(thalecress)MISC_FEATUREPGR5 transit peptide(thale cress) 105Met Ala AlaAla Ser Ile Ser Ala Ile Gly Cys Asn Gln Thr Leu Ile1 5 10 15Gly ThrSer Phe Tyr Gly Gly Trp Gly Ser Ser Ile Ser Gly Glu Asp 20 25 30TyrGln Thr Met Leu Ser Lys Thr Val Ala Pro Pro 35401062955DNAArabidopsis thalianamisc_featurePCRL1genemisc_featurePCRL1 gene(thale cress) 106catatttgat tttcacatggattaacgaaa ctatattatg gaacacattc aaaattataa 60caacaaaaaa aatacaagtattatttcaaa actacacaag gttgtgctta tttcttgaat 120tattttactttcctaatgag agcaaagttt ctcaaagaag taatcatatg atgtttttct180ttgaatgtgc ctcacactta cttacaaaca caacacaagc caatgagagctacatgaaaa 240gatctgaaga ttatacaaaa cagcatacaa actttggtttttctccttct tcttcaattt 300ctccaccttc ttcatttgtt agtattaattttacatacac ttctacataa ccctgagaaa 360aagaaaaccc taaaattttgaattttccat tgaatcaaga aagatttcat cagaaatcaa 420agttgagataagaattaaac cttggctctt agatttaagc tttccctcct tctggtaatg480tgatcaaacg agaacctgag tcatagacca tctccgttcc acagctaaaaaccagaagaa 540tcataagact tcaagaaacg ttgtagacaa tttgtgtgatcgattcgagt ctacagctga 600gaagcttacc ctgagcattt gacattgttggtgttactat cattggggat ggatagtatc 660gttccaaaga aagatacattctctgttcca cagtttgggc aagggccctg tgaaagatat 720gttccacgaaaattaaaagc atttcataat aatcgcataa aactcgtagg atttggcact780ataccaatcc aaatttgtag cgtttagcac aaaatagatt attatctcaagtctaatctc 840ttgtttaagc atttttgata ctgagaaaac aagatttagttctataactt ttatttttcc 900acttcatgaa ctgatcttgg aagatgattaatgtttttac cttcaagatc aagaagtctt 960tgaggatcag tttggtgagtgataacgcca gatatacaat tgcaggcacc gcagcgaacc 1020atgtgaaaatgaaactgaat ggttccggaa gctgcagatt tttgtttttg ttttttaatc1080agttgcatga atactggaac aattactacg agtatatatt ctccaaaccatagtagaata 1140gtcgaaagag gttttacctc gagcaggtac gtgatttcaaaaccagtaat gtcatcaaga 
1200aagaaaaatc tggagaacaa ttgaagaaaacaacaatttt aaactaacta acataagcta 1260agatcatgtg atttgaaagttgagagagga acagaaccgg aggtactcac agtccgagag 1320caacaacagttgctggtaca ttcaacaaga acattttgaa gtaatcaatg gcaagatcac1380tataaacctg cataatcaag gaggtccaca agtctataac atctctagagtttgtcccaa 1440acaatgaatc taatgttatg ttctgtaatg tccaaagaatatatgagcta tccgaacaat 1500taagagtttt taccttttta ctacgaagactgcatcttgg accctcacac acaatctcac 1560tgccgtccat ctacagaaaccaaagaaaca atcataacgt ttgtccaaat tacacatgta 1620acaagatggatgaaactaag aaatagtatt tgtaagtata aatagattaa gaacctttag1680tttcattttg agcttatcat actcttcatc actcaagatt ggatttccagagacataagc 1740cattgaagct tcaaggaatc tttgttcatc agaacctacaatacatagat aaaattagat 1800caagaatcaa gaacctaggc gaatggattattgacaaaac tataaatcat aagtgttcat 1860tacttagcat gacaacactgcttccttccc acatcaactc ttctttaagg ttatcaaact 1920cttcattagacataatcgct ttgccttcgt aataaaacga ctgcaaaaga aaagaaacag1980aaacaatcct cgattatata gagataaacc catactaatg ataaaaacactttatttgat 2040gtgttacttg catcgcttgg aggaactctt gttccatttcaccgatagtt ctcttctcat 2100tcttgttgat gctacaataa ggtaaaatcttgctatcaac ttcttcccca cccacctgac 2160ctgaagacaa gtcataaaaatgattttaag aagtaaggaa actctcaagg agcaatcttt 2220tagtggattagagtataaaa actaaaaatc cacagaggaa aaaagttcca tataacaact2280tttcttaact agaattaaag cttgagtgat tttattctat gattgaataaaatcaaaact 2340ttctcaaaag ccactgtgtt cccaaacaat gatcagagacaaaatcaaag ctacaataca 2400acagcttttc tcaactaaat ttgaagattgagtgcttttt tgtttcgatc acataacgat 2460gagttaataa cttaagaaccttaagctaca cacaaatttt aatcctaaaa aggctacaaa 2520ttggaaatcatttatctaat tatcttctat gatcataaaa atctcaactt ttcacaccaa2580tttcgttccc aaagaaagat cagaggcaaa aacaaataaa aaaatcgaaactttaaagag 2640gcaaataaaa atcgagacct gattgatcag tagaagctttaaggggcaat aaggtaagtc 2700ttcgtctgag agaaatcgat cgtccatgggtaaagggagc aggacactgt gtcctcgaag 2760aagaacaagt gatgggtttgcgagaaattg cagaaaatct agggattgtt agagtaaaag 2820ccatcgtctttatccctcac gccgatgatt gagtgagatc gttgttttct cttgtccggg2880acgaagaaca aaaaaaaaag ttagaagctt tggatttgtg tggttgagaattgagatggt 
2940 gatgtttttt actgt 2955
107 55 PRT Arabidopsis thaliana (thale cress) MISC_FEATURE Rubisco MISC_FEATURE Rubisco (thale cress) 107 Met Ala Ser Ser Met Leu Ser Ser Ala Thr Met Val Ala Ser Pro Ala 1 5 10 15 Gln Ala Thr Met Val Ala Pro Phe Asn Gly Leu Lys Ser Ser Ala Ala 20 25 30 Phe Pro Ala Thr Arg Lys Ala Asn Asn Asp Ile Thr Ser Ile Thr Ser 35 40 45 Asn Gly Gly Arg Val Asn Cys 50 55
108 795 DNA Homo sapiens 108 gaattcatgt ctcatcattg gggttatggt aaacacaatg gtcctgaaca ctggcataaa 60 gactttccaa ttgcaaaagg tgaacgtcaa tcacctgttg atattgacac tcatacagct 120 aaatatgacc cttctttaaa accattatct gtttcatatg atcaagcaac ttctttacgt 180 attttaaaca atggtcatgc ttttaatgta gaatttgatg actctcaaga taaagcagta 240 ttaaaaggtg gtccattaga tggtacttac cgtttaattc aatttcactt tcactggggt 300 tcattagatg gtcaaggttc agaacatact gtagataaaa aaaaatatgc tgcagaatta 360 cacttagttc actggaacac aaaatatggt gattttggta aagctgtaca acaacctgat 420 ggtttagctg ttttaggtat ttttttaaaa gttggtagtg ctaaaccagg tcttcaaaaa 480 gttgttgatg tattagattc aattaaaaca aaaggtaaaa gtgctgactt tactaatttc 540 gatcctcgtg gtttacttcc tgaatcttta gattactgga catatccagg ttcattaaca 600 acacctcctc ttttagaatg tgtaacatgg attgtattaa aagaaccaat tagtgtaagt 660 agtgaacaag tattaaaatt ccgtaaactt aatttcaatg gtgaaggtga accagaagaa 720 ttaatggttg ataactggcg tccagctcaa ccattaaaaa atcgtcaaat taaagcttca 780 ttcaaataag catgc 795
* * * * *