CN116239703A - Fusion protein, efficient specific base editing system containing same and application - Google Patents

Fusion protein, efficient specific base editing system containing same and application

Info

Publication number
CN116239703A
Authority
CN
China
Prior art keywords
seq
fusion protein
amino acid
fragment
editing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310186267.3A
Other languages
Chinese (zh)
Inventor
陆春菊
徐天宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jikang Technology Zhuhai Co ltd
Original Assignee
Jikang Technology Zhuhai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jikang Technology Zhuhai Co ltd
Priority to CN202310186267.3A
Publication of CN116239703A

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/16 Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22 Ribonucleases RNAses, DNAses
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00 Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
    • A01K67/027 New or modified breeds of vertebrates
    • A01K67/0275 Genetically modified vertebrates, e.g. transgenic
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A61K38/16 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • A61K38/43 Enzymes; Proenzymes; Derivatives thereof
    • A61K38/46 Hydrolases (3)
    • A61K38/465 Hydrolases (3) acting on ester bonds (3.1), e.g. lipases, ribonucleases
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • A61K38/16 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • A61K38/43 Enzymes; Proenzymes; Derivatives thereof
    • A61K38/46 Hydrolases (3)
    • A61K38/50 Hydrolases (3) acting on carbon-nitrogen bonds, other than peptide bonds (3.5), e.g. asparaginase
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/0008 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'non-active' part of the composition delivered, e.g. wherein such 'non-active' part is not delivered simultaneously with the 'active' part of the composition
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/04 Antibacterial agents
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P31/00 Antiinfectives, i.e. antibiotics, antiseptics, chemotherapeutics
    • A61P31/12 Antivirals
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P35/00 Antineoplastic agents
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P37/00 Drugs for immunological or allergic disorders
    • A61P37/02 Immunomodulators
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/65 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression using markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/78 Hydrolases (3) acting on carbon to nitrogen bonds other than peptide bonds (3.5)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Y ENZYMES
    • C12Y305/00 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5)
    • C12Y305/04 Hydrolases acting on carbon-nitrogen bonds, other than peptide bonds (3.5) in cyclic amidines (3.5.4)
    • C12Y305/04004 Adenosine deaminase (3.5.4.4)
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/64 Fluorescence; Phosphorescence
    • G01N21/6428 Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes"
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2227/00 Animals characterised by species
    • A01K2227/10 Mammal
    • A01K2227/105 Murine
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2267/00 Animals characterised by purpose
    • A01K2267/03 Animal model, e.g. for test or diseases
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K2319/00 Fusion polypeptide
    • C07K2319/01 Fusion polypeptide containing a localisation/targetting motif
    • C07K2319/09 Fusion polypeptide containing a localisation/targetting motif containing a nuclear localisation signal
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Biotechnology (AREA)
  • Medicinal Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Veterinary Medicine (AREA)
  • Animal Behavior & Ethology (AREA)
  • Molecular Biology (AREA)
  • Biochemistry (AREA)
  • Public Health (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Immunology (AREA)
  • Microbiology (AREA)
  • Physics & Mathematics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Epidemiology (AREA)
  • Plant Pathology (AREA)
  • Biophysics (AREA)
  • Oncology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Environmental Sciences (AREA)
  • Communicable Diseases (AREA)
  • Virology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Animal Husbandry (AREA)
  • Optics & Photonics (AREA)
  • Analytical Chemistry (AREA)
  • General Physics & Mathematics (AREA)

Abstract

The application belongs to the field of biotechnology and relates to an editing system, method and application for efficient and specific A-to-G base substitution. The fusion protein comprises, in order from the N-terminus to the C-terminus, a first SaCas9 nickase fragment, a chimeric deaminase fragment and a second SaCas9 nickase fragment; the deaminase is selected from adenosine deaminase or a variant thereof, the adenosine deaminase is selected from ecTadA8e, the amino acid sequence of ecTadA8e is shown in SEQ ID No.20, or the deaminase has more than 80% sequence identity with the amino acid sequence shown in SEQ ID No.20 and has the function or activity of ecTadA8e. In combination with a corresponding guide RNA, the fusion protein provided by the application can efficiently and specifically replace a base A at a target site with G, provides an effective tool for repairing pathogenic mutations, studying gene function, improving cell function and the like, and has good application prospects.

Description

Fusion protein, efficient specific base editing system containing same and application
Technical Field
The specification relates to the field of biotechnology, and in particular to an efficient and specific base editing system and its application.
Background
Gene editing mediated by the CRISPR/Cas9 system is simple, efficient and broadly applicable, and has become the most widely used gene editing research tool. The CRISPR/Cas9 system works by using a Cas9 protein with DNA double-strand cleavage activity to cut both strands of a target site under the guidance of a specific gRNA; mutations such as insertion, deletion or replacement of fragments or bases are then introduced through the cell's repair mechanisms. Base editors are a newer class of gene editing tool: by fusing an inactivated or partially inactivated Cas protein with a deaminase or a reverse transcriptase, they can catalyze base conversion efficiently without generating DNA double-strand breaks and without a donor DNA template, and are therefore expected to be used in germplasm improvement and gene therapy. Currently common base editors are cytosine base editors (CBE), adenine base editors (ABE), GBE base editors (Guanine base editor, GBE) and prime editors (PE); the first three achieve C-T base substitution, A-G base substitution and C-G base transversion, respectively, while the last can achieve specific base conversions and the insertion or deletion of small sequence fragments.
Base editing was originally developed by David R. Liu's team at Harvard University, which built the first base editors, CBE and ABE, by fusing a Cas9 enzyme-activity mutant protein (nickase Cas9) to a cytosine deaminase or an adenine deaminase, respectively. Because the base excision repair pathway is activated, CBE produces not only C-to-T conversions but also, to some extent, non-T by-products and indels (insertion and deletion events). Subsequent studies reported severe random off-target editing in CBE-treated cells and embryos. In contrast, first-generation ABEs (such as ABE7.10) did not induce significant indels, and ABE7.10 rarely induced Cas9-independent DNA off-target editing. These properties give ABEs an advantage for future clinical applications. A more active editor, ABE8e (which uses ecTadA8e as its deaminase), has been obtained by molecular evolution, but it suffers from severe Cas9-independent DNA and RNA off-target editing. Studies have indicated that ABE8e is prone to off-target editing because the ecTadA8e deaminase linked to Cas9 is always active, so that DNA bound by Cas9 before the intended target has been found can also be acted on. Reported countermeasures include mutating ecTadA8e, regulating deaminase expression and optimizing the delivery of the editor, but off-target events still cannot be completely avoided. This poses a significant safety risk for clinical application of the tool.
In summary, developing an efficient and specific base editor that induces base substitution of a target sequence with high efficiency without inducing off-target editing is one of the problems that the gene editing field needs to solve.
Disclosure of Invention
Based on this, the present invention aims to provide a fusion protein, a base editing system containing the fusion protein, and applications thereof. The fusion protein is designed from a SaCas9 nickase, the adenine deaminase ecTadA8e, flexible linker sequences and the nuclear localization signal BPNLS. The fusion protein can form a complex with a guide RNA (sgRNA) to constitute a base editing system, in which the sgRNA guides the fusion protein to specifically recognize and cleave a target sequence and carry out base editing from adenine (A) to guanine (G). When used to edit cellular genes, the base editing system of the invention shows high A-to-G base substitution efficiency with almost no DNA or RNA off-target editing events, and is an efficient and specific adenine base editor. This facilitates the safe clinical use of base editors in gene and cell therapy.
The application provides a fusion protein, which comprises, in order from the N-terminus to the C-terminus, a first SaCas9 nickase fragment, a chimeric deaminase fragment and a second SaCas9 nickase fragment; the deaminase is selected from adenosine deaminase or a variant thereof, the adenosine deaminase is selected from ecTadA8e, the amino acid sequence of ecTadA8e is shown in SEQ ID No.20, or the deaminase has more than 80% sequence identity with the amino acid sequence shown in SEQ ID No.20 and has the function or activity of ecTadA8e.
The present application also provides an isolated polynucleotide encoding the fusion protein described above.
The present application also provides an expression vector comprising the isolated polynucleotide described above.
The present application also provides an expression system that comprises the above expression vector or has the above polynucleotide exogenously integrated into its genome.
The present application also provides a base editing system comprising the fusion protein or the encoding polynucleotide thereof.
The application also provides the use of the fusion protein, the isolated polynucleotide, the expression vector, the expression system or the base editing system in gene editing.
The application also provides a gene editing method comprising the following steps: the target sequence is base-edited by the fusion protein described above, the isolated polynucleotide described above, the expression vector described above, or the expression system described above, or the base editing system described above.
The application also provides a base-edited cell, obtained by performing A-to-G or T-to-C gene editing on a target sequence using the gene editing method described above.
The present application also provides a reporter system comprising the nucleotide sequence set forth in SEQ ID NO. 15.
The present application also provides the use of the above-described reporter system for detecting the A-G editing efficiency of the above-described fusion protein, isolated polynucleotide, expression vector, expression system or base editing system.
The application also provides a method for detecting the A-G editing efficiency of a base editing product, in which the above-described reporter system is used to detect the A-G editing efficiency of the product to be tested.
Benefits provided by embodiments of the present description include, but are not limited to: (1) the application provides a novel fusion protein with a gene editing function and a base editing system containing it, both characterized by high editing efficiency and high specificity in vitro and in vivo; (2) the fusion protein and the base editing system containing it can be used for at least one of correction of pathogenic sites, gene function research, enhancement of cell function and cell therapy; (3) the base editing system provided by the application can achieve A-G editing in mammalian cells with a highest base replacement efficiency of 90%, and can likewise achieve efficient A-G editing in mammalian cells with a highest base replacement efficiency of 63%.
Drawings
The present application will be further illustrated by way of example embodiments, which will be described in detail with reference to the accompanying drawings. These embodiments are not limiting, wherein:
FIG. 1 is a schematic diagram of a fluorescence reporting system according to some embodiments of the present application;
FIG. 2 is a bar graph of BFP fluorescence ratios at various times after SAABE8e transfection of cells according to some embodiments of the present application;
FIG. 3 is a flow chart illustrating random insertion of ecTadA8e into the middle of a SaCas9 nickase (nSaCas9) protein using a transposase according to some embodiments of the present application;
FIG. 4 is a bar graph of editing efficiency of a reporting system by an editor designing different chimeric sites according to some embodiments of the present application;
FIG. 5A is a schematic diagram of the structure of a SaABE8e base editing protein according to some embodiments of the present application;
FIG. 5B is a schematic representation of the structure of a chimeric CE-SAABE8e base editing protein according to some embodiments of the present application;
FIG. 6 is an analysis of editing efficiency of 5 sites of HEK293T cells with a CE-SAABE8e base editing protein according to some embodiments of the present application;
FIG. 7A is a DNA-level off-target analysis of a CE-SAABE8e base editing protein according to some embodiments of the present application;
FIG. 7B is an RNA-level off-target analysis of a CE-SAABE8e base editing protein according to some embodiments of the present application;
FIG. 8 is an analysis of editing efficiency of a CE-SAABE8e base editing protein on 5 endogenous gene loci of a mouse according to some embodiments of the present application.
Detailed Description
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
As used in this specification and the claims, the terms "a," "an," and "the" do not refer specifically to the singular and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" indicate only that the explicitly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may include other steps or elements.
Flowcharts are used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be understood that the preceding or following operations are not necessarily performed in the exact order given. Instead, the steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.
The application provides a fusion protein, which comprises, in order from the N-terminus to the C-terminus, a first SaCas9 nickase fragment, a chimeric deaminase fragment and a second SaCas9 nickase fragment; the deaminase is selected from adenosine deaminase or a variant thereof, the adenosine deaminase is selected from ecTadA8e, the amino acid sequence of ecTadA8e is shown in SEQ ID No.20, or the deaminase has more than 80% sequence identity with the amino acid sequence shown in SEQ ID No.20 and has the function or activity of ecTadA8e.
In some embodiments, the fusion protein may be a chimeric protein of a SaCas9 nickase and a deaminase fragment. In some embodiments, the chimeric site of the deaminase fragment may be selected from positions 730-744 of the SaCas9 nickase amino acid sequence. In some embodiments, the chimeric site of the deaminase fragment is preferably selected from positions 733, 736, 739 or 744 of the SaCas9 nickase amino acid sequence.
"Sequence" in this context is generally understood to include both the relevant amino acid sequence and the nucleic acid sequence or nucleotide sequence encoding the amino acid sequence, unless a more defined interpretation is required herein.
"Sequence identity" between two polypeptide sequences indicates the percentage of identical amino acids between the sequences. "Sequence similarity" indicates the percentage of amino acids that are identical or that represent conservative amino acid substitutions. Methods for assessing the degree of sequence identity between amino acid or nucleotide sequences are known to those skilled in the art. For example, amino acid sequence identity is typically measured using sequence analysis software; the BLAST program of the NCBI database may be used to determine identity.
As used herein, the terms "polynucleotide," "nucleotide," "oligonucleotide," and "nucleic acid" are used interchangeably to refer to a nucleic acid comprising DNA, RNA, derivatives thereof, or combinations thereof.
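The "more than 80% sequence identity" criterion used throughout this description can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and counts identity as identical residues divided by alignment length; this is only one of several conventions, and tools such as BLAST may report a different figure. The function names and the toy peptides are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same residue in both sequences.

    ASSUMPTION: inputs are pre-aligned, equal-length strings with '-' for gaps,
    and identity is computed over the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def exceeds_identity(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True when identity is strictly greater than the threshold ("more than 80%")."""
    return percent_identity(aligned_a, aligned_b) > threshold


# Hypothetical toy peptides, not sequences from the sequence listing.
reference = "MKTAYIAKQR"
one_substitution = "MKTAYIAKQA"
two_substitutions = "MKTAYIAKAA"

print(percent_identity(reference, one_substitution))
print(exceeds_identity(reference, one_substitution))
print(exceeds_identity(reference, two_substitutions))
```

Because the text says "more than 80%", the sketch treats a variant at exactly 80% identity as not meeting the criterion; whether the boundary is inclusive in practice is not stated in the source.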
In some embodiments, the fusion protein may further comprise a nuclear localization signal fragment located at the N-terminus and/or the C-terminus of the fusion protein. In some embodiments, the nuclear localization signal fragment may be an optimized nuclear localization signal (BPNLS) or a variant thereof. In some embodiments, the amino acid sequence of the BPNLS may be as shown in SEQ ID No. 17. In some embodiments, the amino acid sequence of the variant may have more than 80% sequence identity to BPNLS and function as BPNLS.
The term "nuclear localization signal" (NLS) refers to an amino acid sequence that induces the transport of a molecule comprising or linked to such a sequence into the nucleus of a eukaryotic cell. The nuclear localization signal may form part of the molecule to be transported. In some embodiments, the NLS may be attached to the remainder of the molecule by covalent bonds, hydrogen bonds, or ionic interactions. In some embodiments, the nuclear localization signal may aid in entry of the fusion protein into the nucleus.
In some embodiments, the fusion protein may further comprise a first flexible linker peptide fragment and a second flexible linker peptide fragment. In some embodiments, the first flexible linker peptide fragment may be located between the first SaCas9 nickase fragment and the chimeric deaminase fragment. In some embodiments, the second flexible linker peptide fragment may be located between the chimeric deaminase fragment and the second SaCas9 nickase fragment.
In some embodiments, the amino acid sequence of the first flexible linker peptide fragment may be as shown in SEQ ID No.18. In some embodiments, the amino acid sequence of the first flexible linker peptide fragment may have more than 80% sequence identity to the amino acid sequence shown in SEQ ID No.18 and have the function or activity of the first flexible linker peptide fragment.
In some embodiments, the amino acid sequence of the second flexible linker peptide fragment may be as shown in SEQ ID No.19. In some embodiments, the amino acid sequence of the second flexible linker peptide fragment may have more than 80% sequence identity to the amino acid sequence shown in SEQ ID No.19 and have the function or activity of the second flexible linker peptide fragment.
In some embodiments, the amino acid sequence of the fusion protein may include a fragment as set forth in any one of SEQ ID NOS.10-13, or an amino acid sequence having greater than 80% sequence identity to one of the amino acid sequences set forth in SEQ ID NOS.10-13 and having the function of the amino acid sequence defined in SEQ ID NOS.10-13.
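The domain order described above (first SaCas9 nickase fragment, first flexible linker, deaminase, second flexible linker, second SaCas9 nickase fragment) can be sketched as a simple string operation. The sketch assumes that "chimeric site N" means insertion immediately after residue N of the nickase (1-based), which the text does not state explicitly, and it uses toy placeholder sequences rather than SEQ ID No.18-20 or the real nSaCas9 sequence. The name `build_chimera` is hypothetical.

```python
# Chimeric sites the text names as preferred (1-based positions in the
# SaCas9 nickase amino acid sequence).
PREFERRED_SITES = (733, 736, 739, 744)


def build_chimera(nickase: str, deaminase: str, site: int,
                  linker1: str = "", linker2: str = "") -> str:
    """Return first fragment + linker1 + deaminase + linker2 + second fragment.

    ASSUMPTION: "chimeric site N" is interpreted as insertion immediately
    after residue N (1-based); the source does not spell this out.
    """
    if not 1 <= site < len(nickase):
        raise ValueError("site must lie inside the nickase sequence")
    first_fragment = nickase[:site]    # first SaCas9 nickase fragment
    second_fragment = nickase[site:]   # second SaCas9 nickase fragment
    return first_fragment + linker1 + deaminase + linker2 + second_fragment


# Hypothetical toy sequences, NOT SEQ ID No.18-20 or the real nSaCas9.
toy_nickase = "M" + "A" * 799
toy_deaminase = "DDDD"
toy_linker1, toy_linker2 = "GGS", "SGG"

chimeras = {site: build_chimera(toy_nickase, toy_deaminase, site,
                                toy_linker1, toy_linker2)
            for site in PREFERRED_SITES}
for site, seq in chimeras.items():
    # Show the inserted linker-deaminase-linker block at each site.
    print(site, len(seq), seq[site:site + 10])
```

Nuclear localization signals at the N-terminus and/or C-terminus would be concatenated onto the result in the same way; they are left out here to keep the sketch short.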
The present application also provides an isolated polynucleotide encoding the fusion protein described above.
A polynucleotide refers to a polymer of nucleotides that are typically linked from one deoxyribose or ribose to another, and depending on the context, refers to DNA as well as RNA. Polynucleotides in the present application do not comprise any size limitation and also include polynucleotides comprising modifications, in particular comprising modified nucleotides. In some embodiments, the polynucleotide may be RNA, DNA, cDNA, or the like. Methods for providing such isolated polynucleotides should be known to those skilled in the art, and may be obtained, for example, by automated DNA synthesis and/or recombinant DNA techniques, etc., or may be isolated from suitable natural sources.
The present application also provides an expression vector, which may contain the isolated polynucleotide described above.
"Vector" as used herein refers to a polynucleotide capable of carrying at least one polynucleotide fragment. The vector may deliver a nucleic acid fragment or polynucleotide into a host cell. It may comprise at least one expression cassette comprising regulatory sequences for the correct expression of the polynucleotide incorporated therein. Polynucleotides to be introduced into a cell (e.g., polynucleotides encoding a product of interest or a selectable marker) may be inserted into an expression cassette of a vector for expression therefrom. Vectors according to the present application may exist in circular or linear (linearized) form and also include vector fragments. The term "vector" also encompasses artificial chromosomes or similar individual polynucleotides that permit transfer of exogenous nucleic acid fragments.
The present application also provides an expression system that comprises the above expression vector or has the above polynucleotide exogenously integrated into its genome.
The expression system may be a host cell, which may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell or a filamentous fungal cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: bacterial cells such as Escherichia coli, Streptomyces and Salmonella typhimurium; fungal cells such as yeast and filamentous fungi; plant cells; insect cells such as Drosophila S2 or Sf9; and animal cells such as CHO, COS, 293 cells or Bowes melanoma cells. Methods for introducing expression vectors into host cells are known to those skilled in the art and may be, for example, microinjection, particle gun, electroporation, virus-mediated transformation, electron bombardment, calcium phosphate precipitation, and the like. The choice of expression system depends on a variety of factors including cell growth characteristics, expression levels, intracellular and extracellular expression, post-translational modification and biological cleanliness of the protein of interest, as well as regulatory issues and economic considerations in the production of therapeutic proteins.
In some embodiments, the host cell of the expression system may be selected from eukaryotic cells or prokaryotic cells. In some embodiments, the host cell is preferably selected from a mouse cell or a human cell.
The present application also provides a base editing system comprising the fusion protein or the encoding polynucleotide thereof.
In some embodiments, the base editing system may further comprise a guide RNA. In some embodiments, the guide RNA can target a target sequence.
The term "guide RNA" refers to an RNA molecule capable of directing a CRISPR effector with nuclease activity to target and cleave a specified target nucleic acid.
In some embodiments, the base editing system can comprise one or more vectors. In some embodiments, the one or more vectors may comprise a first regulatory element and a second regulatory element. In some embodiments, the first regulatory element is operably linked to the polynucleotide encoding the fusion protein. In some embodiments, the second regulatory element is operably linked to the polynucleotide encoding the guide RNA. In some embodiments, the first regulatory element and the second regulatory element may be located on the same vector or on different vectors.
In some embodiments, the base editing system can comprise (i) a fusion protein, and (ii) a guide RNA or a vector comprising the guide RNA encoding polynucleotide.
The application also provides the use of the fusion protein, the isolated polynucleotide, the expression vector, the expression system or the base editing system in gene editing.
In some embodiments, the gene editing may effect base substitution. In some embodiments, the gene editing may effect a substitution of A to G or a substitution of T to C. In some embodiments, the gene editing may be used for at least one of: correction of pathogenic sites, gene function studies, enhancement of cellular function, and cell therapy. In some embodiments, the fusion protein, isolated polynucleotide, expression vector, expression system, or base editing system may be used in combination with other drugs or agents. In some embodiments, the disease caused by the pathogenic site may be selected from at least one of the following: autoimmune diseases, tumors, viral infectious diseases, and bacterial infectious diseases.
In some embodiments, the autoimmune disease may include, but is not limited to: systemic lupus erythematosus, rheumatoid arthritis, systemic vasculitis, scleroderma, pemphigus, dermatomyositis, mixed connective tissue disease, autoimmune hemolytic anemia, and the like.
In some embodiments, the tumor may include, but is not limited to: ewing's sarcoma (Ewing's sarcoma), neuroendocrine tumor, glioblastoma, neuroblastoma, melanoma, skin cancer, breast cancer, colon cancer, rectal cancer, prostate cancer, liver cancer, kidney cancer, pancreatic cancer, lung cancer, biliary tract cancer, cervical cancer, endometrial cancer, esophageal cancer, gastric cancer, head and neck cancer, medullary thyroid cancer, ovarian cancer, glioma, lymphoma, leukemia, myeloma, acute lymphoblastic leukemia, acute myelogenous leukemia, chronic lymphoblastic leukemia, chronic myelogenous leukemia, hodgkin's lymphoma, non-hodgkin's lymphoma, or urinary bladder cancer, and the like.
In some embodiments, the viral infectious disease may include, but is not limited to: measles, rubella, mumps, varicella, AIDS, condyloma acuminatum, viral hepatitis, etc. In some embodiments, the bacterial infectious disease may include, but is not limited to: tuberculosis, acute tonsillitis, bacillary dysentery, suppurative meningitis, scarlet fever, acute pharyngolaryngitis, etc.
The application also provides a gene editing method comprising the following steps: the target sequence is base edited by the fusion protein, the isolated polynucleotide, the expression vector or the expression system or the base editing system.
In some embodiments, the methods may be performed in vitro. In some embodiments, preferably, the method may be performed in cultured cells. In some embodiments, the method is performed in vivo; preferably, the method is carried out in a mammal. In some embodiments, it is further preferred that the method can be practiced in rodents or primates. In some embodiments, it is further preferred that the method can be practiced in a mouse or in a human.
The application also provides a cell with a base editing function, the cell being obtained by performing A-to-G or T-to-C gene editing on a target sequence using the above gene editing method.
The present application also provides a reporter system that may comprise a nucleotide sequence as set forth in SEQ ID NO. 15.
In some embodiments, the reporting system may comprise a nucleotide sequence as set forth in SEQ ID No. 15. In some embodiments, the reporting system may display green fluorescence. In some embodiments, the reporter system may exhibit blue fluorescence when the nucleotide sequence shown in SEQ ID NO.15 is mutated to SEQ ID NO. 16. In some embodiments, the reporter system may not exhibit fluorescence when the nucleotide sequence shown in SEQ ID NO.15 is mutated to SEQ ID NO. 14.
In some embodiments, the reporting system may comprise a plasmid. In some embodiments, the plasmid may comprise a nucleotide sequence as set forth in SEQ ID No. 15. In some embodiments, the reporting system may further comprise a guide RNA or a vector comprising the guide RNA encoding polynucleotide sequence. In some embodiments, the guide RNA may target the codon sequence of amino acid 66 of the reporter protein encoded by SEQ ID No. 15. In some embodiments, preferably, the target sequence of the guide RNA may be as shown in SEQ ID No. 5.
The present application also provides the use of the above-described reporter system for detecting the A-G editing efficiency of the above-described fusion protein, isolated polynucleotide, expression vector or expression system or base editing system.
The application also provides a method for detecting the A-G editing efficiency of a base editing product, in which the above reporter system is used to detect the A-G editing efficiency of the product to be tested.
The experimental methods in the following examples are conventional methods unless otherwise specified. The test materials used in the examples described below, unless otherwise specified, were purchased from conventional Biochemical reagent companies. The quantitative tests in the following examples were all set up in triplicate and the results averaged.
Example 1: fluorescent reporting system capable of testing A-G editing efficiency
In a mammalian cell line, the reporter system is subjected to base editing by a base substitution editor, and whether editing has occurred, and how efficiently, is determined from the fluorescent signal. The specific implementation is as follows:
1. Construction of the reporter system.
The nucleotide sequence encoding the reporter protein expressed by the reporter system (its working principle is shown in FIG. 1) is shown in SEQ ID NO.15, and the amino acid sequence of the reporter protein is shown in SEQ ID NO.2. When the codon of amino acid 66 on the coding strand is TAC (corresponding to ATG on the non-coding strand), tyrosine is encoded and the reporter protein encoded by the reporter system displays green fluorescence. When a base editing system targets the non-coding strand of the reporter gene and introduces an A-to-G mutation, ATG on the non-coding strand is mutated to GTG; accordingly, the codon of amino acid 66 becomes CAC, histidine is encoded, and the reporter system displays blue fluorescence. In this case the coding nucleotide sequence of the reporter protein becomes that shown in SEQ ID NO.16, and its amino acid sequence becomes that shown in SEQ ID NO.1. When base editing introduces a mutation that changes the codon of amino acid 66 to TAG, a stop codon results and the reporter system displays no fluorescence; the coding nucleotide sequence of the reporter protein then becomes that shown in SEQ ID NO.14, and its amino acid sequence becomes that shown in SEQ ID NO.3. The proportion of blue-fluorescent cells is analyzed by flow cytometry, and the A-G base substitution rate is inferred from it. The nucleotide sequence of the vector plasmid (reporter system) containing the reporter protein constructed in this example is shown in SEQ ID NO.39, and the corresponding reporter system is named the BFP-AG reporter system.
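The codon logic of the reporter described above can be checked with a short Python sketch. This is only an illustration and not part of the described method: the helper name `edit_noncoding_a_to_g` and the reduced codon table are assumptions introduced here, and the non-coding strand is written base by base opposite the coding codon, as in the paragraph above.

```python
# Illustrative sketch: effect of an A-to-G edit on the non-coding strand
# at the codon of amino acid 66 of the reporter protein.

# Only the codons discussed in the text are listed (standard genetic code).
CODON_TABLE = {"TAC": "Tyr", "CAC": "His", "TAG": "Stop"}

# Base-pairing partner of each DNA base.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement (no reversal) of a DNA string."""
    return seq.translate(_COMPLEMENT)


def edit_noncoding_a_to_g(codon: str, position: int) -> str:
    """Hypothetical helper: convert the A paired with codon[position] on the
    non-coding strand to G and return the resulting coding-strand codon."""
    noncoding = complement(codon)
    if noncoding[position] != "A":
        raise ValueError("no A on the non-coding strand at this position")
    edited_noncoding = noncoding[:position] + "G" + noncoding[position + 1:]
    return complement(edited_noncoding)


original = "TAC"                              # Tyr66: green fluorescence
edited = edit_noncoding_a_to_g(original, 0)   # ATG -> GTG on the non-coding strand
print(original, CODON_TABLE[original], "->", edited, CODON_TABLE[edited])
```

Under these assumptions the sketch reproduces the TAC (Tyr) to CAC (His) change that underlies the green-to-blue conversion.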
SEQ ID NO.15:
ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGG
ACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCAC
CTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAACTGCCCGTGCCCTGGC
CCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCAC
ATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCA
CCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGG
CGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAAC
ATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGA
CAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGC
AGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGC
TGCTGCCCGACAACCACTACCTGAGCACCCAGTCCAAGCTGAGCAAAGACCCCAACGA
GAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCA
TGGACGAGCTGTACAAGTGA
SEQ ID NO.2:
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADH
YQQNTPIGDGPVLLPDNHYLSTQSKLSKDPNEKRDHMVLLEFVTAAGITLGMDELYK
SEQ ID NO.16:
ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGG
ACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCAC
CTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAACTGCCCGTGCCCTGGC
CCACCCTCGTGACCACCCTGACCCACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCAC
ATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCA
CCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGG
CGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAAC
ATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGA
CAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGC
AGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGC
TGCTGCCCGACAACCACTACCTGAGCACCCAGTCCAAGCTGAGCAAAGACCCCAACGA
GAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCA
TGGACGAGCTGTACAAGTGA
SEQ ID NO.1:
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTLTHGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADH
YQQNTPIGDGPVLLPDNHYLSTQSKLSKDPNEKRDHMVLLEFVTAAGITLGMDELYK
SEQ ID NO.14:
ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGG
ACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCAC
CTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAACTGCCCGTGCCCTGGC
CCACCCTCGTGACCACCCTGACCTAGGGCGTGCAGTGCTTCAGCCGCTACCCCGACCAC
ATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCA
CCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGG
CGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAAC
ATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGA
CAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGC
AGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGC
TGCTGCCCGACAACCACTACCTGAGCACCCAGTCCAAGCTGAGCAAAGACCCCAACGA
GAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCA
TGGACGAGCTGTACAAGTGA
SEQ ID NO.3:
MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTL
VTTLTxGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLV
NRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADH
YQQNTPIGDGPVLLPDNHYLSTQSKLSKDPNEKRDHMVLLEFVTAAGITLGMDELYK
(x represents the stop codon)
SEQ ID NO.39:
GACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACAATCTGCTCTGATG
CCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCG
CGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTG
CTTAGGGTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACAT
TGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATAT
GGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACC
CCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCC
ATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGT
ATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATT
ATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCAT
CGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGA
CTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACC
AAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGC
GGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCTCTGGCTAACTAGAGAACC
CACTGCTTACTGGCTTATCGAAATTAATACGACTCACTATAGGGAGACCCAAGCTGGCTA
GCATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCT
GGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCC
ACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAACTGCCCGTGCCCTG
GCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACC
ACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCG
CACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAG
GGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCA
ACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCC
GACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACG
GCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGT
GCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCAAGCTGAGCAAAGACCCCAAC
GAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGG
CATGGACGAGCTGTACAAGTGAAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGT
GGTGGAATTCTGCAGATATCCAGCACAGTGGCGGCCGCTCGAGTCTAGAGGGCCCGTTT
AAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCT
CCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATG
AGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGG
CAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGG
GCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCG
CCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTAC
ACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTT
CGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGC
TTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCAT
CGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGAC
TCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGG
GATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGC
GAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAGTCCCCAGGCTCCCCAGC
AGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCAGGTGTGGAAAGTCCC
CAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTAGTCAGCAACCATA
GTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCG
CCCCATGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCTGCCTCTGAG
CTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTCCC
GGGAGCTTGTATATCCATTTTCGGATCTGATCAAGAGACAGGATGAGGATCGTTTCGCAT
GATTGAACAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGGCTATTCG
GCTATGACTGGGCACAACAGACAATCGGCTGCTCTGATGCCGCCGTGTTCCGGCTGTCA
GCGCAGGGGCGCCCGGTTCTTTTTGTCAAGACCGACCTGTCCGGTGCCCTGAATGAACT
GCAGGACGAGGCAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAGCT
GTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTATTGGGCGAAGTGCCGG
GGCAGGATCTCCTGTCATCTCACCTTGCTCCTGCCGAGAAAGTATCCATCATGGCTGATG
CAATGCGGCGGCTGCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAAA
CATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGTCGATCAGGATGATCT
GGACGAAGAGCATCAGGGGCTCGCGCCAGCCGAACTGTTCGCCAGGCTCAAGGCGCGC
ATGCCCGACGGCGAGGATCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATG
GTGGAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTGTGGCGGACCG
CTATCAGGACATAGCGTTGGCTACCCGTGATATTGCTGAAGAGCTTGGCGGCGAATGGG
CTGACCGCTTCCTCGTGCTTTACGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTA
TCGCCTTCTTGACGAGTTCTTCTGAGCGGGACTCTGGGGTTCGAAATGACCGACCAAGC
GACGCCCAACCTGCCATCACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGG
GCTTCGGAATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCATG
CTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAATGGTTACAAATAAAGC
AATAGCATCACAAATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGT
CCAAACTCATCAATGTATCTTATCATGTCTGTATACCGTCGACCTCTAGCTAGAGCTTGGC
GTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAAC
ATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCAC
ATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCA
TTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTT
CCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCAC
TCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTG
AGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTC
CATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGC
GAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGC
TCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGC
GTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCC
AAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAA
CTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTG
GTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGG
CCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTT
ACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCG
GTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCT
TTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTT
GGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTT
AAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTG
AGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGT
GTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCG
AGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCC
GAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGG
GAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACA
GGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGA
TCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCC
TCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACT
GCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCA
ACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATA
CGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCT
TCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCAC
TCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAA
AACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAAT
ACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCG
GATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCC
GAAAAGTGCCACCTGACGTC
2. Construction of sgRNA expression vectors
An sgRNA vector specifically targeting the reporter protein was constructed using the AAV-EFS-SaABE8e-bGH-U6-sgRNA-BsmBI vector (Addgene: 189922) as the base vector; this vector contains expression elements for the previously reported SaABE8e protein (the protein structure is shown in FIG. 5A) and its guide RNA. For convenience, the BsmBI cleavage site on the vector was replaced with the commonly used BsaI cleavage site sequence to obtain the AAV-EFS-SaABE8e-bGH-U6-sgRNA-BsaI vector.
According to the SaCas9 design principles, a 22-nt targeting sequence, SEQ ID NO.5: CGCCGTGGGTCAGGGTGGTCAC, was designed against the editing target, namely the codon of amino acid 66 of the reporter protein, and the corresponding sgRNA expression vector was constructed as follows:
An oligonucleotide pair with sticky ends was synthesized according to the target site sequence, as shown in SEQ ID NO.6: CACCCGCCGTGGGTCAGGGTGGTCAC and SEQ ID NO.7: AAACCCGTAGGTCAGGGTGGTCAC. The primers were annealed to form an oligo duplex, which was ligated into the AAV-EFS-SaABE8e-bGH-U6-sgRNA-BsaI vector linearized by BsaI digestion, to construct the specifically targeting sgRNA vector, designated AAV-SaABE8e-gRNA-BFP. The experimental procedure was as follows:
2.1 Annealing to form the oligo duplex
The annealing reaction system is as follows:
Upstream primer (10 μM) 20 μL
Downstream primer (10 μM) 20 μL
The annealing program is as follows: 95 ℃ for 5 min; 95 ℃ to 85 ℃ at 2 ℃/s; 85 ℃ to 25 ℃ at 0.1 ℃/s; hold at 4 ℃.
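As a rough planning aid, the run time of the annealing program can be estimated with the following Python sketch. It assumes that the two temperature ranges are linear ramps at the stated rates (2 ℃/s and 0.1 ℃/s) and ignores the final 4 ℃ hold; the function name `ramp_seconds` is introduced here for illustration only.

```python
# Illustrative sketch: estimated duration of the annealing program,
# assuming linear ramps at the stated rates (an interpretation of the text).

def ramp_seconds(start_c: float, end_c: float, rate_c_per_s: float) -> float:
    """Time in seconds to move between two temperatures at a constant rate."""
    return abs(start_c - end_c) / rate_c_per_s


program = [
    ("hold at 95 C for 5 min", 5 * 60),
    ("ramp 95 C -> 85 C at 2 C/s", ramp_seconds(95, 85, 2.0)),
    ("ramp 85 C -> 25 C at 0.1 C/s", ramp_seconds(85, 25, 0.1)),
]

total_s = sum(seconds for _, seconds in program)
for label, seconds in program:
    print(f"{label}: {seconds:.0f} s")
print(f"total before the 4 C hold: {total_s:.0f} s")
```

Under these assumptions the slow 85 ℃ to 25 ℃ ramp dominates, and the whole program takes about 15 minutes before the 4 ℃ hold.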
2.2 BsaI digestion to obtain the linearized vector
The cleavage reaction system is as follows:
AAV-EFS-SaABE8e-bGH-U6-sgRNA-BsaI plasmid 2μg
10×Cutsmart buffer 5μL
BsaI enzyme (NEB: R0539L) 5μL
Adding water to 50μL
After the above system was prepared, it was reacted at 37 ℃ for 3 hours. 2 μL was taken for agarose gel electrophoresis to verify that the vector was completely linearized, and the completely linearized digestion product was recovered using the AxyPrep PCR recovery kit (Axygen, Code No. AP-PCR-250G, the same below) to obtain the linearized vector.
2.3 Ligation of the annealed product with the linearized vector
The annealed product obtained in 2.1 and the linearized vector obtained in 2.2 were ligated using the DNA Ligation Kit Ver.2.1 (Takara, Code No.: 6022Q), and the reaction system was as follows:
[Table image not reproduced: ligation reaction system]
After the above system was prepared, it was incubated at 16 ℃ for 30 minutes and then used to transform DH5α competent cells. After recovery on a shaker at 37 ℃ for 30 minutes, the cells were spread on LB agar plates containing ampicillin and incubated upside down in a 37 ℃ incubator overnight. The next day, single colonies were picked for verification by Sanger sequencing. After the ligation was confirmed to be successful and the sequence error-free, the plasmid was extracted.
3. Transfection of a mammalian cell line with the A-G editing system and the reporter system.
HEK293T cells were seeded and cultured in DMEM medium containing 10% FBS (Hyclone, Code No.: SH30022.01B, the same below). The day before transfection, the cells were split into 24-well plates. The next day, transfection was performed when the cell density reached 70%-80%. The medium was replaced with fresh medium two hours before transfection. Following the operating manual of the EZTrans cell transfection reagent (Code No. AC04L091, the same below), 900 ng of the AAV-SaABE8e-gRNA-BFP base editing vector plasmid and 100 ng of the BFP-AG reporter system expression vector plasmid were mixed and co-transfected into the cells. The medium was changed after 6-8 hours, and fluorescence was detected by flow cytometry at 24, 48, 72 and 96 hours.
4. Fluorescence reporting system analysis of base substitution efficiency
Flow cytometry results were analyzed using FlowJo software. The inventors found that AAV-SaABE8e-gRNA-BFP could achieve fluorescence conversion from GFP to BFP (FIG. 2). The fluorescence intensity was highest at 48 hours, so 48 hours was used as the detection time point in subsequent experiments.
The results show that the report system constructed by the invention can more accurately reflect the A-G editing efficiency and can be used as a method for testing the base editing efficiency of an adenine base editor.
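The application does not state how the flow cytometry counts are converted into an editing efficiency. One possible readout, given here only as a hedged sketch, is the fraction of fluorescent cells that are blue. The function name `conversion_fraction`, the choice of denominator and the event counts below are all assumptions for illustration and are not data from this application; cells that lost fluorescence through a stop codon are not counted in this simple form.

```python
# Illustrative sketch: one possible way to summarize GFP-to-BFP conversion
# from gated flow cytometry event counts (assumed normalization).

def conversion_fraction(bfp_positive: int, gfp_positive: int) -> float:
    """Fraction of fluorescent events that fall in the blue (BFP) gate."""
    fluorescent = bfp_positive + gfp_positive
    if fluorescent == 0:
        raise ValueError("no fluorescent events in the gates")
    return bfp_positive / fluorescent


# Hypothetical gated event counts, for illustration only.
example = conversion_fraction(bfp_positive=2500, gfp_positive=7500)
print(f"BFP fraction of fluorescent cells: {example:.1%}")
```

Whatever normalization is chosen, applying the same gates and time point (48 hours in this example) to every editor under comparison keeps the readouts comparable.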
Example 2: construction of chimeric CE-SaABE8e
Studies have shown that SaABE8e, in which the deaminase is fused at the N-terminus, is prone to random off-target editing of DNA and RNA, which poses a serious safety risk for the in vivo use of base editors. Therefore, the present invention inserts the highly efficient ecTadA8e deaminase into a protein domain of the SaCas9 nickase, so that random deamination by the deaminase can be effectively avoided and off-target effects reduced.
1. Construction of pET-nSaCas9-SagRNA-AmpR (W163X) -KanR vector
The pET-nSaCas9-SagRNA-AmpR (W163X) -KanR vector, the nucleotide sequence of which is shown in SEQ ID NO.8, was constructed using the ClonExpress II One Step Cloning Kit (Code No. C112-02, the same below). The ampicillin resistance gene on the vector contains a stop codon TAG at amino acid 163 (bolded in the SEQ ID NO.8 sequence); when this TAG is edited to TGG by targeted A-G editing, ampicillin resistance is restored and the corresponding bacteria can grow on plates containing ampicillin.
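The selection principle of the AmpR (W163X) marker can be illustrated with a short Python sketch that lists the codons reachable from TAG by a single A-to-G change on the coding strand. The helper name `single_a_to_g_products` and the reduced codon table are assumptions introduced for illustration.

```python
# Illustrative sketch: which codons a single A-to-G edit can produce from
# the TAG stop codon at position 163 of the AmpR (W163X) gene.

# Only the codons needed here are listed (standard genetic code).
CODON_TABLE = {"TAG": "Stop", "TGG": "Trp"}


def single_a_to_g_products(codon: str) -> list[str]:
    """Return every codon obtained by changing exactly one A to G."""
    return [
        codon[:i] + "G" + codon[i + 1:]
        for i, base in enumerate(codon)
        if base == "A"
    ]


products = single_a_to_g_products("TAG")
for product in products:
    print("TAG ->", product, CODON_TABLE.get(product, "not listed"))
```

TAG contains a single A, so the only single A-to-G product is TGG (tryptophan), which matches the W163X naming of the mutation and explains why growth on ampicillin reports A-G editing.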
SEQ ID NO.8:
AAACGCAATTATATCCTGGGCCTGGCTATCGGTATTACTTCTGTTGGTTACGGTATCATTGACTACGAAACTCGCGATGTGATCGATGCTGGTGTGCGCCTGTTCAAAGAAGCTAACGTAGAAAATAACGAAGGCCGTCGTTCTAAGCGCGGTGCACGTCGTCTGAAACGCCGTCGCCGTCACCGTATTCAGCGTGTGAAAAAACTGCTGTTCGATTACAACCTGCTGACCGATCATAGCGAACTGTCTGGCATCAACCCTTATGAAGCTCGTGTTAAAGGTCTGTCTCAGAAACTGAGCGAAGAAGAATTCTCCGCAGCGCTGCTGCACCTGGCTAAACGTCGCGGTGTCCATAACGTCAACGAAGTTGAAGAAGATACCGGCAATGAACTGTCCACTAAAGAACAGATCTCCCGTAATAGCAAAGCTCTGGAAGAAAAATATGTTGCTGAACTGCAGCTGGAACGCCTGAAAAAAGACGGCGAAGTTCGTGGTTCTATCAATCGTTTTAAAACCTCCGACTATGTAAAAGAAGCTAAACAGCTGCTGAAGGTTCAGAAAGCCTATCACCAGCTGGATCAGAGCTTCATTGACACTTACATCGACCTGCTGGAAACCCGTCGTACGTACTACGAAGGCCCGGGCGAAGGCTCTCCGTTCGGTTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGGTCACTGTACTTACTTCCCGGAAGAGCTGCGTAGCGTCAAATACGCTTATAACGCGGACCTGTACAACGCGCTGAATGATCTGAACAACCTGGTGATCACCCGCGATGAAAACGAAAAACTGGAATACTACGAAAAATTTCAAATCATCGAAAATGTCTTCAAACAGAAGAAAAAACCGACCCTGAAACAGATCGCAAAAGAGATTCTGGTCAATGAGGAGGATATTAAAGGCTACCGCGTTACCTCTACCGGTAAACCTGAGTTCACCAACCTGAAAGTATACCATGACATCAAGGACATCACCGCTCGTAAAGAGATTATCGAAAATGCAGAGCTGCTGGATCAAATCGCAAAAATCCTGACCATCTACCAGTCCTCTGAAGATATCCAGGAAGAGCTGACCAACCTGAACAGCGAACTGACTCAGGAAGAAATCGAACAGATTTCCAACCTGAAAGGTTACACCGGTACTCACAACCTGAGCCTGAAAGCGATCAACCTGATCCTGGACGAACTGTGGCACACTAACGACAACCAAATTGCAATCTTCAACCGTCTGAAACTGGTGCCAAAAAAAGTAGACCTGTCTCAGCAGAAAGAAATCCCGACCACCCTGGTGGATGACTTTATCCTGTCTCCTGTTGTGAAACGTTCTTTCATTCAGTCTATCAAAGTTATCAATGCCATCATCAAAAAATACGGTCTGCCGAATGATATTATTATTGAACTGGCGCGTGAAAAAAACTCCAAAGACGCACAGAAAATGATCAATGAAATGCAGAAACGTAACCGTCAGACCAATGAACGTATTGAAGAAATTATCCGTACCACCGGCAAAGAAAACGCAAAATATCTGATCGAGAAGATTAAGCTGCACGACATGCAGGAAGGCAAGTGTCTGTATAGCCTGGAGGCGATTCCACTGGAAGACCTGCTGAATAACCCTTTCAACTATGAAGTCGATCACATCATCCCTCGTTCTGTTTCCTTCGATAACTCCTTCAATAACAAGGTTCTGGTAAAACAGGAAGAAAATAGCAAAAAAGGTAACCGCACTCCATTCCAGTACCTGTCTTCTAGCGACTCCAAAATCTCTTACGAAACTTTCAAAAAACACATCCTGAATCTGGCGAAAGGTAAGGGCCGTATCAGCAAGACCAAAAAAGAATACCTGCTGGAAGAACGTGACATCAATCGTTTTTCCGTGCAGAAAGATTTCATCAACCGCAACCTGGTTGACACCCGTTATGCTACTCGTGGTCTGATGAACCTGCTGCGTTCTTATTTCCGTGTCAACAA
CCTGGACGTGAAAGTGAAATCCATTAACGGCGGTTTCACCTCTTTCCTGCGCCGTAAATGGAAATTCAAAAAAGAACGCAACAAGGGTTATAAACACCATGCTGAGGATGCTCTGATCATTGCTAACGCTGACTTCATCTTCAAAGAATGGAAAAAGCTGGATAAGGCTAAAAAAGTTATGGAAAATCAGATGTTCGAAGAAAAACAAGCGGAATCCATGCCGGAGATCGAAACCGAACAAGAGTATAAAGAGATCTTCATCACTCCGCATCAGATCAAACACATTAAAGATTTCAAAGACTATAAATACAGCCACCGTGTGGACAAAAAACCGAACCGTGAGCTGATCAACGATACCCTGTACAGCACTCGCAAAGATGACAAAGGTAACACCCTGATTGTTAACAACCTGAACGGTCTGTACGACAAGGATAATGATAAGCTGAAGAAACTGATCAATAAGAGCCCGGAAAAACTGCTGATGTATCACCATGATCCGCAGACCTACCAGAAACTGAAACTGATCATGGAACAGTACGGTGACGAAAAAAACCCGCTGTACAAATACTACGAAGAAACCGGTAACTATCTGACCAAATACAGCAAAAAAGACAACGGTCCAGTAATCAAAAAGATTAAGTACTACGGTAACAAACTGAATGCCCACCTGGACATCACCGACGACTACCCTAATTCCCGTAACAAGGTAGTTAAACTGTCTCTGAAACCGTACCGTTTTGATGTGTACCTGGATAACGGTGTTTACAAATTCGTGACCGTAAAAAACCTGGACGTGATCAAGAAAGAAAACTATTATGAGGTGAACTCTAAATGCTACGAAGAAGCTAAAAAACTGAAGAAGATCTCTAACCAGGCAGAATTCATCGCGTCTTTCTACAACAACGATCTGATCAAAATTAATGGTGAACTGTACCGTGTGATTGGCGTGAACAACGATCTGCTGAATCGTATTGAAGTAAACATGATTGACATCACCTACCGTGAATACCTGGAAAACATGAACGATAAACGTCCGCCTCGTATTATTAAGACCATCGCTTCTAAAACGCAGTCCATCAAAAAATACTCCACGGATATTCTGGGCAACTCTGGCGGCTCAAAAAGAACCGCCGACGGCAGCGAATTCGAGCCCAAGAAGAAGAGGAAAGTCTAACCGGTCATCATCACCATCACCATTGAGTTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCAGCTGGGGCTCGTTGACAGCTAGCTCAGTCCTAGGTATAATACTAGTGAAACACCGGAGACCACGGCAGGTCTCAGTTTTAGTACTCTGTAATGAAAATTACAGAATCTACTAAAACAAGGCAAAATGCCGTGTTTATCTCGTCAACTTGTTGGCGAGATTTTTTTGATCCGGCTGCTAACAAAGCCCGAAAGGAAGCTGAGTTGGCTGCTGCCACCGCTGAGCAATAACTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGCTGAAAGGAGGAACTATATCCGGATTGGCGAATGGGACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTA
GGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTTTAACAAAATATTAACGTTTACAATTTCAGGTGGCACTTTTCGGGGAAATGTGGGAAATGTGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGAGTATTCAACATTTCCGTGTCGCCCTTATTCCCTTTTTTGCGGCATTTTGCCTTCCTGTTTTTGCTCACCCAGAAACGCTGGTGAAAGTAAAAGATGCTGAAGATCAGTTGGGTGCACGAGTGGGTTACATCGAACTGGATCTCAACAGCGGTAAGATCCTTGAGAGTTTTCGCCCCGAAGAACGTTTTCCAATGATGAGCACTTTTAAAGTTCTGCTATGTGGCGCGGTATTATCCCGTATTGACGCCGGGCAAGAGCAACTCGGTCGCCGCATACACTATTCTCAGAATGACTTGGTTGAGTACTCACCAGTCACAGAAAAGCATCTTACGGATGGCATGACAGTAAGAGAATTATGCAGTGCTGCCATAACCATGAGTGATAACACTGCGGCCAACTTACTTCTGACAACGATCGGAGGACCGAAGGAGCTAACCGCTTTTTTGCACAACATGGGGGATCATGTAACTCGCCTTGATCGTTAGGAACCGGAGCTGAATGAAGCCATACCAAACGACGAGCGTGACACCACGATGCCTGTAGCAATGGCAACAACGTTGCGCAAACTATTAACTGGCGAACTACTTACTCTAGCTTCCCGGCAACAATTAATAGACTGGATGGAGGCGGATAAAGTTGCAGGACCACTTCTGCGCTCGGCCCTTCCGGCTGGCTGGTTTATTGCTGATAAATCTGGAGCCGGTGAGCGTGGAAGCCGCGGTATCATTGCAGCACTGGGGCCAGATGGTAAGCCCTCCCGTATCGTAGTTATCTACACGACGGGGAGTCAGGCAACTATGGATGAACGAAATAGACAGATCGCTGAGATAGGTGCCTCACTGATTAAGCATTGGTAAGCGCGGAACCCCTATTTGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTCATGAATTAATTCTTAGAAAAACTCATCGAGCATCAAATGAAACTGCAATTTATTCATATCAGGATTATCAATACCATATTTTTGAAAAAGCCGTTTCTGTAATGAAGGAGAAAACTCACCGAGGCAGTTCCATAGGATGGCAAGATCCTGGTATCGGTCTGCGATTCCGACTCGTCCAACATCAATACAACCTATTAATTTCCCCTCGTCAAAAATAAGGTTATCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAAAAGTTTATGCATTTCTTTCCAGACTTGTTCAACAGGCCAGCCATTACGCTCGTCATCAAAATCACTCGCATCAACCAAACCGTTATTCATTCGTGATTGCGCCTGAGCGAGACGAAATACGCGATCGCTGTTAAAAGGACAATTACAAACAGGAATCGAATGCAACCGGCGCAGGAACACTGCCAGCGCATCAACAATATTTTCACCTGAATCAGGATATTCTTCTAATACCTGGAATGCTGTTTTCCCGGGGATCGCAGTGGTGAGTAACCATGCATCATCAGGAGTACGGATAAAATGCTTGATGGTCGGAAGAGGCATAAATTCCGTCAGCCAGTTTAGTCTGACCATCTCATCTGTAACATCATTGGCAACGCTACCTTTGCCATGTTTCAGAAACAACTCTGGCGCATCGGGCTTCCCATACAATCG
ATAGATTGTCGCACCTGATTGCCCGACATTATCGCGAGCCCATTTATACCCATATAAATCAGCATCCATGTTGGAATTTAATCGCGGCCTAGAGCAAGACGTTTCCCGTTGAATATGGCTCATAACACCCCTTGTATTACTGTTTATGTAAGCAGACAGTTTTATTGTTCATGACCAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGCATCTGTGCGGTATTTCACACCGCATATATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGCTACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCTGCGGTAAAGCTCATCAGCGTGGTCGTGAAGCGATTCACAGATGTCTGCCTGTTCATCCGCGTCCAGCTCGTTGAGTTTCTCCAGAAGCGTTAATGTCTGGCTTCTGATAAAGCGGGCCATGTTAAGGGCGGTTTTTTCCTGTTTGGTCACTGATGCCTCCGTGTAAGGGGGATTTCTGTTCATGGGGGTAATGATACCGATGAAACGAGAGAGGATGCTCACGATACGGGTTACTGATGATGAACATGCCCGGTTACTGGAACGTTGTGAGGGTAAACAACTGGCGGTATGGATGCGGCGGGACCAGAGAAAAATCACTCAGGGTCAATGCCAGCGCTTCGTTAATACAGATGTAGGTGTTCCACAGGGTAGCCAGCAGCATCCTGCGATGCAGATCCGGAACATAATGGTGCAGGGCGCTGACTTCCGCGTTTCCAGACTTTACGAAACACGGAAACCGAAGACCATTCATGTTGTTGCTCAGGTCGCAGACGTTTTGCAGCAGCAGTCGCTTCACGTTCGCTCGCGTATCGGTGATTCATTCTGCTAACCAGTAAGGCAACCCCGCCAGCCTAGCCGGGTCCTCAACGACAGGAGCACGATCATGCGCACCCGTGGGGCCGCCATGCCGGCGATAATGGCCTGCTTCTCGCCGAAACGTTTGGTGGCGGGACCAGTGACGAAGGCTTGAGCGAG
GGCGTGCAAGATTCCGAATACCGCAAGCGACAGGCCGATCATCGTCGCGCTCCAGCGAAAGCGGTCCTCGCCGAAAATGACCCAGAGCGCTGCCGGCACCTGTCCTACGAGTTGCATGATAAAGAAGACAGTCATAAGTGCGGCGACGATAGTCATGCCCCGCGCCCACCGGAAGGAGCTGACTGGGTTGAAGGCTCTCAAGGGCATCGGTCGAGATCCCGGTGCCTAATGAGTGAGCTAACTTACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCCAGGGTGGTTTTTCTTTTCACCAGTGAGACGGGCAACAGCTGATTGCCCTTCACCGCCTGGCCCTGAGAGAGTTGCAGCAAGCGGTCCACGCTGGTTTGCCCCAGCAGGCGAAAATCCTGTTTGATGGTGGTTAACGGCGGGATATAACATGAGCTGTCTTCGGTATCGTCGTATCCCACTACCGAGATGTCCGCACCAACGCGCAGCCCGGACTCGGTAATGGCGCGCATTGCGCCCAGCGCCATCTGATCGTTGGCAACCAGCATCGCAGTGGGAACGATGCCCTCATTCAGCATTTGCATGGTTTGTTGAAAACCGGACATGGCACTCCAGTCGCCTTCCCGTTCCGCTATCGGCTGAATTTGATTGCGAGTGAGATATTTATGCCAGCCAGCCAGACGCAGACGCGCCGAGACAGAACTTAATGGGCCCGCTAACAGCGCGATTTGCTGGTGACCCAATGCGACCAGATGCTCCACGCCCAGTCGCGTACCGTCTTCATGGGAGAAAATAATACTGTTGATGGGTGTCTGGTCAGAGACATCAAGAAATAACGCCGGAACATTAGTGCAGGCAGCTTCCACAGCAATGGCATCCTGGTCATCCAGCGGATAGTTAATGATCAGCCCACTGACGCGTTGCGCGAGAAGATTGTGCACCGCCGCTTTACAGGCTTCGACGCCGCTTCGTTCTACCATCGACACCACCACGCTGGCACCCAGTTGATCGGCGCGAGATTTAATCGCCGCGACAATTTGCGACGGCGCGTGCAGGGCCAGACTGGAGGTGGCAACGCCAATCAGCAACGACTGTTTGCCCGCCAGTTGTTGTGCCACGCGGTTGGGAATGTAATTCAGCTCCGCCATCGCCGCTTCCACTTTTTCCCGCGTTTTCGCAGAAACGTGGCTGGCCTGGTTCACCACGCGGGAAACGGTCTGATAAGAGACACCGGCATACTCTGCGACATCGTATAACGTTACTGGTTTCACATTCACCACCCTGAATTGACTCTCTTCCGGGCGCTATCATGCCATACCGCGAAAGGTTTTGCGCCATTCGATGGTGTCCGGGATCTCGACGCTCTCCCTTATGCGACTCCTGCATTAGGAAGCAGCCCAGTAGTAGGTTGAGGCCGTTGAGCACCGCCGCCGCAAGGAATGGTGCATGCAAGGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGCCTGCCACCATACCCACGCCGAAACAAGCGCTCATGAGCCCGAAGTGGCGAGCCCGATCTTCCCCATCGGTGATGTCGGCGATATAGGCGCCAGCAACCGCACCTGTGGCGCCGGTGATGCCGGCCACGATGCGTCCGGCGTAGAGGATCGAGATCGATCTCGATCCCGCGAAATTAATACGACTCACTATAGGGGAATTGTGAGCGGATAACAATTCCCCTCTAGAAATAATTTTGTTTAACTTTAAGAAGGAGATATACATGCCACCATGAAACGGACAGCCGACGGAAGCGAGTTCGAGTCACCAAAGAAGAAGCGGAAAGTC
2. Construction of randomly inserted recombinant vector plasmids Using MuA transposase
A fragment of the ecTadA8e gene codon-optimized for E. coli (nucleotide sequence shown in SEQ ID NO. 9) was synthesized by Shanghai Co., Ltd. Under the action of MuA transposase (Thermo Fisher, F-701), the ecTadA8e fragment was randomly inserted at different positions of the pET-nSaCas9-SagRNA-AmpR (W163X)-KanR plasmid, yielding recombinant vectors carrying ecTadA8e at different positions. The reaction system was as follows:
Fragment ecTadA8e 250ng
pET-nSaCas9-SagRNA-AmpR (W163X) -KanR plasmid 500ng
MuA transposase 1μL
5×Reaction Buffer for MuA Transposase 4μL
Adding water to 20μL
The above reaction mixture was incubated at 30 °C for 1 hour to allow random insertion, and then at 75 °C for 10 minutes to inactivate the MuA transposase. The DNA was then purified by isopropanol precipitation, resuspended in 5 μL deionized water, and transformed into 100 μL of BL21 (DE3) competent cells.
SEQ ID NO.9:
TCTGAAGTAGAATTTTCCCACGAATACTGGATGCGCCATGCACTGACCCTGGCAAAACGCGCCCGCGACGAACGTGAAGTTCCAGTTGGTGCGGTGCTGGTACTGAACAACCGTGTAATCGGCGAAGGCTGGAATCGTGCGATCGGTCTGCACGATCCGACTGCACACGCAGAAATCATGGCTCTGCGTCAGGGTGGCCTGGTGATGCAAAATTACCGCCTGATCGATGCGACTCTGTATGTTACCTTCGAACCGTGCGTAATGTGTGCAGGTGCTATGATCCACTCCCGTATTGGTCGCGTCGTGTTTGGTGTTCGCAACTCCAAGCGTGGTGCTGCAGGCTCTCTGATGAACGTGCTGAACTACCCGGGCATGAACCATCGTGTTGAGATCACGGAAGGCATCCTGGCTGACGAATGTGCTGCCCTGCTGTGTGACTTCTACCGTATGCCGCGCCAGGTATTCAACGCCCAGAAGAAGGCGCAGAGCAGCATCAAC
3. Screening of expression plasmids of functionally inserted fusion proteins in E.coli
Only bacteria expressing a functional ecTadA8e insertion fusion can repair the AmpR (W163X) mutation and therefore grow on LB agar plates containing ampicillin.
The transformed bacteria were recovered in SOC medium for 1 hour, plated on 3 LB agar plates containing 10 μg/mL kanamycin, and incubated at 37 °C for 16 hours. Because the vector pET-nSaCas9-SagRNA-AmpR (W163X)-KanR carries a normally expressed KanR gene, both the original vector and the recombinant vectors confer kanamycin resistance, and a large number of colonies grew. Colonies from all of these plates were scraped and resuspended in 100 mL LB containing 500 μM IPTG. The culture was incubated for 10-12 h to induce expression of the functional insertion-type base editing fusion proteins, which repair the mutation in the AmpR (W163X) gene on the vector. Decreasing amounts of culture (5 mL, 1 mL, 500 μL, 100 μL) were then plated on 15 cm LB agar plates containing ampicillin (10 μg/mL) and kanamycin (10 μg/mL). After overnight incubation at 37 °C, colonies were picked and Sanger sequenced to evaluate base editing at AmpR (W163X) and to determine the ecTadA8e insertion site.
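The selection above depends on an A-to-G edit turning the premature stop codon at position 163 of AmpR back into a tryptophan codon. The following Python sketch illustrates that reasoning only. The source does not state which stop codon W163X carries, so the sketch enumerates which stop codons a single A-to-G change can convert to TGG (Trp); reading the edit on the coding strand is an assumption, and the function name is illustrative.

```python
# Which stop codons can a single A->G substitution convert into TGG (Trp)?
# Assumptions (not from the patent): the edit is read on the coding strand,
# and the identity of the stop codon in AmpR (W163X) is unknown.

STOP_CODONS = ("TAA", "TAG", "TGA")
TRP_CODON = "TGG"


def single_a_to_g_products(codon: str) -> list[str]:
    """Return every codon reachable by changing exactly one A to G."""
    products = []
    for i, base in enumerate(codon):
        if base == "A":
            products.append(codon[:i] + "G" + codon[i + 1:])
    return products


# Stop codons that one A->G edit can repair to tryptophan.
repairable = [c for c in STOP_CODONS if TRP_CODON in single_a_to_g_products(c)]
print(repairable)
```

Under these assumptions, only cells in which the inserted deaminase is active at the target adenine regain ampicillin resistance, which is what makes growth on ampicillin a readout of a functional insertion.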
Based on the Sanger sequencing results, the amino acid positions on the SaCas9 nickase selected for ecTadA8e insertion were positions 123, 128, 460, 665, 723, 730, 731, 732, 733, 734, 735, 736, 738, 739, 740, 741, 742, 743, 744, 755, 832, 901, 911, 912, 913 and 953; these amino acid positions are abbreviated as "insertion sites" in the present application.
4. Expression vector construction of base editor carrying chimeric fusion protein
Based on the insertion sites determined by the preliminary screening, the invention constructs recombinant expression vectors of the base editor for expression in mammalian cells. Each vector carries the chimeric fusion protein coding sequence with its promoter and other regulatory element sequences, as well as the guide RNA (gRNA) and its promoter sequence. The construction method is as follows:
First, for each insertion site, a gene fragment encoding the mammalian codon-optimized chimeric fusion protein together with its regulatory elements (hereinafter referred to as "CE fragment") was synthesized by the manufacturer. In order from the N-terminus to the C-terminus, each CE fragment comprises: EFS promoter (nucleotide sequence shown in SEQ ID NO. 4), N-terminal BPNLS nuclear localization signal (amino acid sequence shown in SEQ ID NO. 17), first SaCas9 nickase fragment (26 sequences designed according to the different insertion sites), first flexible linker sequence (amino acid sequence shown in SEQ ID NO. 18), ecTadA8e deaminase fragment (amino acid sequence shown in SEQ ID NO. 20), second flexible linker sequence (amino acid sequence shown in SEQ ID NO. 19), second SaCas9 nickase fragment (26 sequences designed according to the different insertion sites), C-terminal BPNLS sequence (amino acid sequence shown in SEQ ID NO. 21), and bGH-polyA sequence (well known in the art).
Meanwhile, a gene fragment comprising the optimized gRNA (nucleotide sequence shown in SEQ ID NO. 22) and its human U6 promoter sequence (well known in the art) was synthesized (hereinafter referred to as "gRNA fragment").
SEQ ID NO.4:
GAATTCGCTAGCTAGGTCTTGAAAGGAGTGGGAATTGGCTCCGGTGCCCGTCAGTGGGCAGAGCGCACATCGCCCACAGTCCCCGAGAAGTTGGGGGGAGGGGTCGGCAATTGATCCGGTGCCTAGAGAAGGTGGCGCGGGGTAAACTGGGAAAGTGATGTCGTGTACTGGCTCCGCCTTTTTCCCGAGGGTGGGGGAGAACCGTATATAAGTGCAGTAGTCGCCGTGAACGTTCTTTTTCGCAACGGGTTTGCCGCCAGAACACAGGACCGGTGCCACC
SEQ ID NO.17:
MKRTADGSEFESPKKKRKV
SEQ ID NO.18:
SGSETPGTSESATPESGS
SEQ ID NO.19:
SGSGSETPGTSESATPES
SEQ ID NO.20:
SEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSIN
SEQ ID NO.21:
KRTADGSEFEPKKKRKV
SEQ ID NO.22:
GTTTTAGTACTCTGTAATGAAAATTACAGAATCTACTAAAACAAGGCAAAATGCCGT GTTTATCTCGTCAACTTGTTGGCGAGA
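The protein-level element order described for the CE fragment can be sketched in Python as a simple string assembly. This is an illustrative sketch, not the synthesis procedure: the peptide sequences are copied from SEQ ID NO. 17 to 21 above, while the function name, the toy nickase string and the convention that the insert follows residue `site` (1-based) are assumptions. The full-length sequences listed later in this document also appear to carry a few extra residues (SGGS) before the C-terminal signal, which the sketch omits.

```python
# Sketch: assemble a chimeric fusion protein in the order
# N-BPNLS / first nickase fragment / linker 1 / ecTadA8e / linker 2 /
# second nickase fragment / C-BPNLS.
N_BPNLS = "MKRTADGSEFESPKKKRKV"   # SEQ ID NO. 17
LINKER_1 = "SGSETPGTSESATPESGS"   # SEQ ID NO. 18
LINKER_2 = "SGSGSETPGTSESATPES"   # SEQ ID NO. 19
ECTADA8E = "SEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSIN"  # SEQ ID NO. 20
C_BPNLS = "KRTADGSEFEPKKKRKV"     # SEQ ID NO. 21


def assemble_chimera(nickase: str, site: int) -> str:
    """Insert linker-deaminase-linker after residue `site` (1-based, assumed)."""
    if not 0 < site < len(nickase):
        raise ValueError("site must fall inside the nickase sequence")
    first_fragment = nickase[:site]
    second_fragment = nickase[site:]
    return "".join([
        N_BPNLS, first_fragment, LINKER_1, ECTADA8E,
        LINKER_2, second_fragment, C_BPNLS,
    ])


# Hypothetical 10-residue placeholder, NOT the SaCas9 nickase sequence.
TOY_NICKASE = "ACDEFGHIKL"
chimera = assemble_chimera(TOY_NICKASE, 4)
print(len(chimera))
```

Keeping the insertion site as a parameter mirrors the way the 26 constructs differ only in where the nickase is split.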
4.1 fragment amplification and recovery
Upstream and downstream primers were designed, and the synthesized genes (the CE fragments for the 26 insertion sites and the gRNA fragment) were used as DNA templates for amplification to obtain 2 insert fragments with overlapping sequences at both ends; the vector backbone fragment was then amplified using pX601-AAV (Addgene: #61591) as the base vector (primers and corresponding templates are shown in the following table).
Figure BDA0004104029210000231
PCR amplification was performed according to the following PCR reaction system:
2X FastD Pfu Supermix (Whole gold, code No.: AS 231) 25μL
Forward primer (10 μM) 2.5μL
Reverse primer (10 μM) 2.5μL
DNA template 5ng
Adding water to 50μL
The reaction conditions were as follows: initial denaturation at 95 °C for 3 min; 32 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 20 s and extension at 72 °C for 4 min; final extension at 72 °C for 5 min; hold at 4 °C.
The PCR products were recovered using an AxyPrep PCR recovery kit to obtain the 26 different CE fragment recovery products (recovered product 1), the gRNA fragment recovery product (recovered product 2) and the vector backbone recovery product (recovered product 3), and their concentrations were measured.
4.3 fragment recombination
Each of the 26 recovered products 1 was recombined with recovered products 2 and 3 using a recombination kit. The recombination reaction system was as follows:
Recovery of product 1 25ng
Recovery of product 2 25ng
Recovery of product 3 50ng
5×CE II Buffer 2μL
Exnase II 1μL
Adding water to 10μL
After the above system was prepared, it was incubated at 16 °C for 30 minutes and then transformed into DH5α competent cells. After recovery on a shaker at 37 °C for 30 minutes, the cells were plated on LB agar plates containing ampicillin and incubated upside down overnight in a 37 °C incubator. The next day, single colonies were picked for verification by first-generation (Sanger) sequencing. Once the ligation was confirmed to be successful and the sequence correct, plasmids were extracted.
The base editor and the corresponding base editing fusion protein are uniformly named CE-SaABE8e, and the editor vectors constructed for the 26 different insertion sites are distinguished by suffixes; for example, the base editing fusion protein with insertion site 736 and its editor are named CE-SaABE8e-736.
5. Editing of reporting systems by chimeric CE-SaABE8e
An sgRNA target sequence was designed to specifically target the non-coding strand at the codon for amino acid 66 of the reporter protein. The oligos were annealed to form a double-stranded oligo and ligated into the 26 CE-SaABE8e expression vectors that had been linearized by BsaI digestion and recovered, yielding 26 sgRNA vector plasmids specifically targeting the target site. For the sgRNA vector construction method, see Example 1, "2. Construction of sgRNA expression vector".
HEK293T cells were inoculated in DMEM medium containing 10% FBS and cultured at 37 °C under 5% CO2. Cells were split into 24-well plates the day before transfection. The next day, transfection was performed when the density reached 70%-80%. According to the operating manual of the EZTrans cell transfection reagent, 900 ng of each chimeric CE-SaABE8e vector plasmid was mixed with 100 ng of the BFP-AG reporter system and co-transfected into the cells; the medium was changed after 6 to 8 hours, and BFP fluorescence signals were analyzed after 48 hours.
The flow cytometry results were analyzed using FlowJo software, and the A-G base substitution effect was judged by comparing BFP efficiency. The base editing vectors in which A-G editing occurred were CE-SaABE8e-730, CE-SaABE8e-731, CE-SaABE8e-732, CE-SaABE8e-733, CE-SaABE8e-734, CE-SaABE8e-735, CE-SaABE8e-736, CE-SaABE8e-737, CE-SaABE8e-738, CE-SaABE8e-739, CE-SaABE8e-740, CE-SaABE8e-741, CE-SaABE8e-742, CE-SaABE8e-743 and CE-SaABE8e-744 (FIG. 4). The four most efficient editors were CE-SaABE8e-733, CE-SaABE8e-736, CE-SaABE8e-739 and CE-SaABE8e-744, whose chimeric fusion protein amino acid sequences are shown in SEQ ID NO. 10 to 13, respectively. In the following, base editing experiments at mammalian cell sites and mouse endogenous gene sites are presented using CE-SaABE8e-736 as the representative.
SEQ ID NO.10:
MKRTADGSEFESPKKKRKVKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFSGSETPGTSESATPESGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSINSGSGSETPGTSESATPESEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGSGGSKRTADGSEFEPKKKRKV (representing stop codon)
SEQ ID NO.11:
MKRTADGSEFESPKKKRKVKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKSGSETPGTSESATPESGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSINSGSGSETPGTSESATPESQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGSGGSKRTADGSEFEPKKKRKV (representing stop codon)
SEQ ID NO.12:
MKRTADGSEFESPKKKRKVKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESGSETPGTSESATPESGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSINSGSGSETPGTSESATPESSMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGSGGSKRTADGSEFEPKKKRKV (representing stop codon)
SEQ ID NO.13:
MKRTADGSEFESPKKKRKVKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEISGSETPGTSESATPESGSSEVEFSHEYWMRHALTLAKRARDEREVPVGAVLVLNNRVIGEGWNRAIGLHDPTAHAEIMALRQGGLVMQNYRLIDATLYVTFEPCVMCAGAMIHSRIGRVVFGVRNSKRGAAGSLMNVLNYPGMNHRVEITEGILADECAALLCDFYRMPRQVFNAQKKAQSSINSGSGSETPGTSESATPESETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIKKGSGGSKRTADGSEFEPKKKRKV (representing stop codon)
Example 3: efficient A-G mutation of CE-SaABE8e in HEK293T cells
To further study the characteristics and efficiency of the A-G base substitution editor, 5 sites in HEK293T cells were edited. The procedure was as follows:
1. the selection of the target site and the construction of the corresponding sgRNA expression vector.
The 5 sites were selected as follows:
Site1:GTGGTAGACAGCATGTGTCCTAAAGGGT(SEQ ID NO.29);
Site2:ATTTACAGCCTGGCCTTTGGGGTCGGGT(SEQ ID NO.30);
Site3:GGAGAGAAAGAGAAGTTGATTGATGGGT(SEQ ID NO.31);
Site4:GTGTCAGGTAATGTGCTAAACAGAGAGT(SEQ ID NO.32);
Site5:ATGCATTAACTGAAAATGGTCAAGGAGT(SEQ ID NO.33);
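Each listed site can be read as a protospacer followed by a 6-nt PAM. The Python sketch below splits the five sequences on that basis and checks the trailing 6 nt against the NNGRRT motif commonly described for SaCas9. The 6-nt split and the motif are assumptions brought in for illustration (the patent text does not state them), and the helper names are hypothetical.

```python
import re

# Target sites copied from SEQ ID NO. 29-33 above.
SITES = {
    "Site1": "GTGGTAGACAGCATGTGTCCTAAAGGGT",
    "Site2": "ATTTACAGCCTGGCCTTTGGGGTCGGGT",
    "Site3": "GGAGAGAAAGAGAAGTTGATTGATGGGT",
    "Site4": "GTGTCAGGTAATGTGCTAAACAGAGAGT",
    "Site5": "ATGCATTAACTGAAAATGGTCAAGGAGT",
}

# NNGRRT, with R = A or G (assumed SaCas9 PAM motif).
PAM_PATTERN = re.compile(r"^[ACGT]{2}G[AG]{2}T$")


def split_site(seq: str, pam_len: int = 6) -> tuple[str, str]:
    """Split a listed site into (protospacer, PAM), assuming a 3' PAM."""
    return seq[:-pam_len], seq[-pam_len:]


for name, seq in SITES.items():
    protospacer, pam = split_site(seq)
    matches = bool(PAM_PATTERN.match(pam))
    print(name, protospacer, pam, len(protospacer), matches)
```

A check of this kind is a quick way to catch transcription errors in a target list before ordering oligos.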
The corresponding sgRNA primers were designed, and the upstream and downstream oligos were annealed using the following program (95 °C, 5 min; 95 °C to 85 °C at -2 °C/s; 85 °C to 25 °C at -0.1 °C/s; hold at 4 °C) and ligated into the BsaI-linearized CE-SaABE8e-736 vector. Positive clones were grown in shaking culture, and plasmids were extracted (Axygene Co., code No.: AP-MN-P-250G) and their concentration determined for later use. For the sgRNA vector construction method, see Example 1, "2. Construction of sgRNA expression vector".
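The oligo annealing program quoted above (a 5 min hold at 95 °C, a ramp to 85 °C at 2 °C/s, then a ramp to 25 °C at 0.1 °C/s) can be written out as data to estimate its run time before the 4 °C hold. This is a minimal Python sketch; the `Step` class and its field names are hypothetical and not tied to any thermocycler software.

```python
from dataclasses import dataclass


@dataclass
class Step:
    start_c: float              # starting temperature in deg C
    end_c: float                # final temperature in deg C
    rate_c_per_s: float = 0.0   # ramp rate; 0 means an isothermal hold
    hold_s: float = 0.0         # hold time, used only for isothermal steps

    def duration_s(self) -> float:
        """Duration of the step in seconds."""
        if self.rate_c_per_s == 0:
            return self.hold_s
        return abs(self.start_c - self.end_c) / self.rate_c_per_s


PROGRAM = [
    Step(95, 95, hold_s=300),         # 95 C for 5 min
    Step(95, 85, rate_c_per_s=2.0),   # fast ramp to 85 C
    Step(85, 25, rate_c_per_s=0.1),   # slow ramp to 25 C
]

total_s = sum(step.duration_s() for step in PROGRAM)
print(f"{total_s / 60:.1f} min before the 4 C hold")
```

The slow second ramp dominates the run time, which is the part of the program that gives the two oligos time to pair.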
2. Editing of CE-SaABE8e in HEK293T cells
HEK293T cells were inoculated in DMEM medium containing 10% FBS and cultured at 37 °C under 5% CO2. Cells were split into 24-well plates the day before transfection. The next day, transfection was performed when the density reached 70%-80%. According to the operating manual of the EZTrans cell transfection reagent, 900 ng of sgRNA vector plasmid and 450 ng of GFP fluorescence expression plasmid were mixed and co-transfected into the cells; the medium was changed after 6-8 hours, and 10000 GFP-positive cells were sorted by flow cytometry after 48 hours. The cell pellet was lysed and the genotype was identified. The lysis buffer consisted of: 50 mM KCl, 1.5 mM MgCl2, 10 mM Tris pH 8.0, 0.5% Nonidet P-40, 0.5% Tween 20, 100 μg/mL Proteinase K.
3. Analysis of the edit Effect of CE-SaABE8e in HEK293T cells
Using first-generation (Sanger) sequencing, the present invention analyzed the 5 sites described above and calculated the corresponding editing efficiencies (FIG. 6). The results show that CE-SaABE8e achieves efficient A-G editing, with base substitution efficiency ranging from 26% at the lowest to 90% at the highest.
Example 4: CE-SaABE8e can significantly reduce DNA and RNA off-target
Base editors have been reported to cause off-target editing at the DNA and RNA levels. The present invention therefore analyzed off-target editing at the DNA and RNA levels after CE-SaABE8e editing of sites in mammalian cells.
HEK293T cells were inoculated in DMEM medium containing 10% FBS and cultured at 37 °C under 5% CO2. Cells were split into 6-well plates the day before transfection. The next day, transfection was performed when the density reached 70%-80%. According to the operating manual of the EZTrans cell transfection reagent, 4 μg of CE-SaABE8e or control SaABE8e plasmid was mixed with 2 μg of GFP plasmid and co-transfected into the cells in the corresponding wells; the medium was changed after 6-8 hours, and 500000 GFP-positive cells were sorted by flow cytometry after 48 hours.
The sorted cells were centrifuged to collect the pellet, and DNA and RNA were extracted for whole genome sequencing and RNA-seq. Compared with the blank negative control cells, CE-SaABE8e showed no significant difference in off-target editing at the DNA and RNA levels relative to the reference genome (FIG. 7A-7B). The chimeric base editor developed in the present application therefore achieves efficient and specific base editing in mammalian cells.
Example 5: AAV virus package and preparation of CE-SaABE8e recombinant expression vector
Adeno-associated viruses (AAVs) have been used to deliver genes encoding a number of therapeutic proteins in animal models of human diseases, in clinical trials, and in drugs approved by the U.S. Food and Drug Administration. AAVs have become a popular in vivo delivery method owing to their clinical validation, their ability to target various clinically relevant tissues, their relatively high safety and the extensive research behind them. The expression vector of the CE-SaABE8e fusion protein developed in the present application is an AAV recombinant vector (see Example 2 for details), so it can be further packaged into AAV and delivered into a mammal by injection for base editing.
To carry out in vivo editing experiments in mice, AAV loaded with CE-SaABE8e was prepared by the following steps:
1. cell transfection
HEK293T cells were inoculated in DMEM medium containing 10% FBS and cultured at 37 °C under 5% CO2. When confluency reached 90%, the cells were passaged at a 1:3 ratio (approximately 2.5×10^6 per plate) and culture was continued. Two hours before transfection, the medium was replaced with serum-free medium. Transfection was started when cell confluency was 80%-90%. 9 mL of DMEM, 5.7 μg of CE-SaABE8e-736 plasmid, 11.4 μg of pHelper, 22.8 μg of rep-cap plasmid and 1 mL of PEI were added in sequence to a 15 mL centrifuge tube, mixed by shaking, left to stand at room temperature for 30 min, and added dropwise to the 10 cm dishes. On day 1 after transfection, the cell culture medium was replaced with DMEM medium containing 10% FBS.
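The three plasmid amounts given above stand in a 1:2:4 mass ratio (5.7 : 11.4 : 22.8 μg). When preparing several transfection mixes at once, the amounts can be scaled as in the Python sketch below. Treating the stated amounts as one "batch" and scaling them linearly is an assumption, and the function name is illustrative.

```python
# Plasmid masses (micrograms) for one transfection mix, as stated in the text.
PER_BATCH_UG = {
    "CE-SaABE8e-736": 5.7,
    "pHelper": 11.4,
    "rep-cap": 22.8,
}


def plasmid_mix(n_batches: int) -> dict[str, float]:
    """Scale the per-batch plasmid masses linearly (assumed) to n batches."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    return {name: round(ug * n_batches, 1) for name, ug in PER_BATCH_UG.items()}


# Mass ratio relative to the editor plasmid.
editor_ug = PER_BATCH_UG["CE-SaABE8e-736"]
ratios = {name: round(ug / editor_ug, 2) for name, ug in PER_BATCH_UG.items()}

print(ratios)
print(plasmid_mix(3))
```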
2. Virus collection and purification
Virus was harvested on day 4 post transfection. The cells were collected together with the medium into 50 mL centrifuge tubes using a rubber cell scraper and centrifuged at 2,000 g for 10 min, and the cell pellet and the medium supernatant were harvested separately. The cell pellet from each plate was resuspended in 500 μL of hypertonic lysis buffer (40 mM Tris-base, 500 mM NaCl, 2 mM MgCl2 and 100 U/mL salt-active nuclease) and incubated at 37 °C for 1 h to lyse the cells. The culture supernatant was filtered through a 0.45 μm filter, and 5× PEG-8000/NaCl solution (40% PEG-8000, 2.5 M NaCl) was added to a final concentration of 1× (8% PEG, 500 mM NaCl); after 2 h of incubation, the mixture was centrifuged at 3,200 g for 30 min and the pellet was likewise resuspended in 500 μL of hypertonic lysis buffer. The crude lysates obtained in the two steps above were combined and either incubated overnight at 4 °C or taken immediately for ultracentrifugation. The cell lysate was centrifuged at 2,000 g for 10 min, and the virus was purified by iodixanol density gradient centrifugation.
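The PEG precipitation step adds a 5× stock to the filtered supernatant so that the mixture ends at 1× (8% PEG, 500 mM NaCl). The Python sketch below shows the arithmetic under the usual dilution rule: one part of 5× stock is added to four parts of sample, and the stock concentrations are five times the stated final values. The function names and the 40 mL example volume are illustrative.

```python
def stock_volume_to_add(sample_ml: float, fold: float = 5.0) -> float:
    """Volume of a `fold`x stock that brings sample + stock to 1x."""
    if fold <= 1:
        raise ValueError("fold must be greater than 1")
    return sample_ml / (fold - 1.0)


def stock_concentration(final_conc: float, fold: float = 5.0) -> float:
    """Stock concentration implied by a 1x final concentration."""
    return final_conc * fold


sample_ml = 40.0  # hypothetical supernatant volume
add_ml = stock_volume_to_add(sample_ml)

print(add_ml)                       # mL of 5x stock to add
print(stock_concentration(8.0))     # % PEG-8000 in the 5x stock
print(stock_concentration(500.0))   # mM NaCl in the 5x stock
```

Working backwards from the final values in this way gives 40% PEG-8000 and 2500 mM (2.5 M) NaCl for the 5× stock.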
3. Virus concentration and titration
The solution from the previous step was exchanged into cold PBS containing 0.001% F-68 and concentrated using a PES 100 kDa MWCO column (ThermoFisher, Pierce 88533). The concentrated AAV solution was sterile-filtered through a 0.22 μm filter, titrated by qPCR using the AAVpro titration kit version 2 (Clontech), and stored at 4 °C until use.
Example 6: CE-SaABE8e realizes efficient endogenous gene A-G mutation in mice
Gene editing offers the potential for clinical treatment of a variety of genetic diseases, but most gene editing studies and treatments for genetic diseases need to be performed in vivo. To investigate the functional characteristics and efficiency of the CE-SaABE8e base editor developed in this application for in vivo editing, A-G base substitution editing was performed at 5 endogenous gene loci in mice. The procedure was as follows:
1. the selection of the target site and the construction of the corresponding sgRNA expression vector.
The 5 sites were selected as follows:
PCSK9_exon1:GCCACCGCAGCCACGCAGAGCAGTGGGT(SEQ ID NO.34);
PCSK9_exon5:GCGTGCTTACCTGTCTGTGGAAGCGGGT(SEQ ID NO.35)
PCSK9_exon8:GCCATCCTGCTCACCTGTCTCATGGGT(SEQ ID NO.36)
PCSK9_exon9:GCCATCCTGCTTACCTGCCCCATGGGT(SEQ ID NO.37)
Angptl3_exon4:GTGTTTCCATGGGTTTACCTGATTGGGT(SEQ ID NO.38);
The corresponding sgRNA primers were designed, and the upstream and downstream oligos were annealed using the following program (95 °C, 5 min; 95 °C to 85 °C at -2 °C/s; 85 °C to 25 °C at -0.1 °C/s; hold at 4 °C) and ligated into the BsaI-linearized CE-SaABE8e-736 vector. Positive clones were grown in shaking culture, and plasmids were extracted and their concentration determined for later use. For the sgRNA vector construction method, see Example 1, "2. Construction of sgRNA expression vector".
2. AAV production with sgRNA expression vectors
HEK293T cells were transfected with each of the 5 sgRNA expression vectors constructed in the previous step according to the method of Example 5, and AAV was prepared by collection, purification and concentration.
2. Mice were injected and edited in vivo.
The C57BL/6J mice used in the experiments herein were purchased from The Jackson Laboratory. Humanized PCSK9 mice have been reported previously. All mice were housed in one room on a 12-hour light/dark cycle and provided with standard rodent chow and water. The mice received no immunosuppression or other differential treatment before injection or during the experiment, except for fasting before blood collection as described below. Before tail vein delivery of AAV vector or control, blood was drawn several times to collect pre-injection samples and to habituate the animals to handling during the procedure.
Mice were randomly assigned to 5 experimental groups and 1 control group. Before injection, the AAV for each experimental group was drawn up at a dose of 4×10^10 vg and diluted to 100 μL with sterile 0.9% phosphate buffer (PBS, pH 7.4). Anesthesia was induced with 2-4% isoflurane. After induction, once the mouse showed no response to bilateral toe pinch, the skin was gently pressed to make the right eye protrude, an insulin syringe was advanced into the retro-orbital sinus, and the AAV solution was injected slowly. A drop of the pramipexole dihydrochloride ophthalmic solution was then applied to the eye as an analgesic.
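Preparing the stated dose (4×10^10 vg in a final volume of 100 μL) from a titrated AAV stock is a simple dilution calculation. The Python sketch below shows one way to compute the stock and diluent volumes; the titer used in the example (2×10^12 vg/mL) is a hypothetical value, not one reported in this document, and the function name is illustrative.

```python
def injection_mix(dose_vg: float, titer_vg_per_ml: float,
                  final_volume_ul: float = 100.0) -> tuple[float, float]:
    """Return (stock_ul, diluent_ul) needed to deliver dose_vg in final_volume_ul."""
    if titer_vg_per_ml <= 0:
        raise ValueError("titer must be positive")
    stock_ul = dose_vg * 1000.0 / titer_vg_per_ml  # convert mL to microliters
    if stock_ul > final_volume_ul:
        raise ValueError("stock is too dilute for the requested dose and volume")
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical titer of 2e12 vg/mL, chosen only for illustration.
stock_ul, diluent_ul = injection_mix(dose_vg=4e10, titer_vg_per_ml=2e12)
print(stock_ul, diluent_ul)
```

The explicit error for an overly dilute stock flags preparations that would need further concentration before injection.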
At the fourth week post AAV injection, liver specimens were collected: to collect liver tissue and greater serum amounts for detection, mice were euthanized by carbon dioxide inhalation, a portion of the minced liver tissue was collected for genomic DNA extraction, and a portion of the liver tissue was flash frozen in liquid nitrogen for RNA extraction. The whole genome sequencing and transcriptome sequencing were performed on DNA and RNA extracted from the livers of mice in the experimental group along with the control group, respectively.
3. Edit analysis of CE-SaABE8e in mice
Using high throughput sequencing, the present invention analyzed the 5 sites described above and counted the corresponding editing efficiencies (FIG. 8). The result shows that the CE-SaABE8e can realize efficient A-G editing in the mammal body, and the highest base substitution efficiency reaches 63% and the lowest base substitution efficiency reaches 35%.
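For sequencing-based readouts of this kind, A-G editing efficiency at a target adenine is commonly expressed as the fraction of reads carrying G at that position. The Python sketch below shows this calculation; the read counts are hypothetical placeholders rather than data from this document, and the patent does not specify the analysis pipeline that was actually used.

```python
def a_to_g_efficiency(counts: dict[str, int]) -> float:
    """Percentage of reads with G at a position whose reference base is A."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads cover this position")
    return 100.0 * counts.get("G", 0) / total


# Hypothetical base counts at one target adenine (not measured data).
example_counts = {"A": 600, "G": 400, "C": 0, "T": 0}
efficiency = a_to_g_efficiency(example_counts)
print(f"{efficiency:.1f}% A-to-G")
```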
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations to the present disclosure may occur to one skilled in the art. Such modifications, improvements, and modifications are intended to be suggested within this specification, and therefore, such modifications, improvements, and modifications are intended to be included within the spirit and scope of the exemplary embodiments of the present invention.
Meanwhile, the specification uses specific words to describe the embodiments of the specification. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present description. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present description may be combined as suitable.
In some embodiments, numbers describing quantities of components and attributes are used; it should be understood that such numbers used in the description of embodiments are, in some examples, modified by the terms "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicate that the number allows for a 20% variation. Accordingly, in some embodiments, numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought to be obtained by the individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and employ a method for preserving the general number of digits. Although the numerical ranges and parameters set forth herein are approximations that may be employed in some embodiments to confirm the breadth of the range, in particular embodiments, the setting of such numerical values is as precise as possible.
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (23)

1. A fusion protein comprising, in order from the N-terminus to the C-terminus, a first SaCas9 nickase fragment, a chimeric deaminase fragment, a second SaCas9 nickase fragment; the deaminase is selected from adenosine deaminase or a variant thereof, the adenosine deaminase is selected from ecTadA8e, the amino acid sequence of the ecTadA8e is shown as SEQ ID No.20, or the deaminase has more than 80% sequence identity with the amino acid sequence shown as SEQ ID No.20, and the deaminase has the function or activity of the ecTadA8 e.
2. The fusion protein of claim 1, wherein the fusion protein is a chimeric protein of a SaCas9 nickase and a deaminase fragment, wherein the chimeric site of the deaminase fragment is selected from the group consisting of positions 730-744 of the SaCas9 nickase amino acid sequence; preferably, the chimeric site of the deaminase fragment is selected from the 733, 736, 739 or 744 of the SaCas9 nickase amino acid sequence.
3. The fusion protein of any one of claim 1, further comprising a nuclear localization signal fragment;
preferably, the nuclear localization signal fragment is located at the N-terminus and/or the C-terminus of the fusion protein;
further preferably, the nuclear localization signal fragment is an optimized nuclear localization signal (BPNLS) or a variant thereof, the amino acid sequence of which is shown in SEQ ID No.17, or has more than 80% sequence identity with BPNLS and has the function of BPNLS.
4. The fusion protein of claim 1, further comprising a first flexible linker peptide fragment positioned between the first SaCas9 nickase fragment and the chimeric deaminase fragment and a second flexible linker peptide fragment positioned between the chimeric deaminase fragment and the second SaCas9 nickase fragment;
the amino acid sequence of the first flexible connecting peptide fragment is shown as SEQ ID NO.18, or has more than 80% sequence identity with the amino acid sequence shown as SEQ ID NO.18, and has the function or activity of the first flexible connecting peptide fragment;
the amino acid sequence of the second flexible connecting peptide fragment is shown as SEQ ID NO.19, or has more than 80% sequence identity with the amino acid sequence shown as SEQ ID NO.19, and has the function or activity of the second flexible connecting peptide fragment.
5. The fusion protein of claims 1-4, wherein the amino acid sequence of the fusion protein comprises an amino acid sequence as set forth in any one of SEQ ID nos. 10-13, or an amino acid sequence having greater than 80% sequence identity to one of the amino acid sequences set forth in SEQ ID nos. 10-13 and having the function of the amino acid sequence defined in SEQ ID nos. 10-13.
6. An isolated polynucleotide encoding the fusion protein of any one of claims 1-5.
7. An expression vector comprising the isolated polynucleotide of claim 6.
8. An expression system comprising the expression vector of claim 7 or the polynucleotide of claim 6 integrated with an exogenous source in the genome.
9. The expression system of claim 8, wherein the host cell of the expression system is selected from a eukaryotic cell or a prokaryotic cell; preferably, the host cell is selected from the group consisting of a mouse cell and a human cell.
10. A base editing system comprising the fusion protein of any one of claims 1-5 or a polynucleotide encoding the same.
11. The base editing system of claim 10, further comprising a guide RNA that directs targeting of the fusion protein to a target.
12. The base editing system according to claim 10, comprising at least any one of:
1) The base editing system comprises one or more vectors; the one or more vectors comprise (i) a first regulatory element operably linked to a polynucleotide encoding the fusion protein; and (ii) a second regulatory element operably linked to a polynucleotide encoding the guide RNA;
wherein (i) and (ii) are on the same vector or on different vectors;
2) The base editing system comprises (i) a fusion protein, and (ii) a guide RNA or a vector comprising a polynucleotide encoding the guide RNA.
13. Use of the fusion protein according to any one of claims 1 to 5, and/or the isolated polynucleotide according to claim 6, and/or the expression vector according to claim 7, and/or the expression system according to claim 8 or 9, and/or the base editing system according to any one of claims 10 to 12 in gene editing.
14. Use according to claim 13, comprising at least any one of the following:
1) The gene editing realizes base substitution;
2) The gene editing realizes an A-to-G substitution or a T-to-C substitution;
3) The gene editing is used for at least one of correction of pathogenic sites, gene function research, enhancement of cell functions and cell treatment;
4) The fusion proteins, isolated polynucleotides, expression vectors, expression systems or base editing systems are used in combination with other drugs or agents.
15. The use according to claim 14, wherein the disease caused by the pathogenic site is selected from at least any one of the following: autoimmune diseases, tumors, viral infectious diseases, bacterial infectious diseases.
16. A method of gene editing comprising: base editing of a target sequence by a fusion protein according to any one of claims 1 to 5, an isolated polynucleotide according to claim 6, an expression vector according to claim 7, an expression system according to claim 8 or 9, or a base editing system according to any one of claims 10 to 12.
17. The method of claim 16, comprising at least any one of:
1) The method is carried out in vitro; preferably, the method is carried out in cultured cells;
2) The method is performed in vivo; preferably, the method is carried out in a mammal; further preferably, the method is performed in a rodent or primate; still further preferably, the method is carried out in a mouse or human.
18. A base-edited cell obtained by a gene editing method according to claim 16 or 17, wherein the target sequence in the cell is subjected to A-to-G or T-to-C gene editing.
19. A reporter system comprising a nucleotide sequence set forth in SEQ ID No. 15.
20. The reporter system of claim 19, comprising at least any one of:
1) The reporter system comprises a nucleotide sequence shown as SEQ ID NO.15, and the reporter system exhibits green fluorescence; when the nucleotide sequence shown as SEQ ID NO.15 is mutated to SEQ ID NO.16, the reporter system exhibits blue fluorescence; when the nucleotide sequence shown as SEQ ID NO.15 is mutated to SEQ ID NO.14, the reporter system does not exhibit fluorescence;
2) The reporter system comprises a plasmid comprising a nucleotide sequence as set forth in SEQ ID NO. 15;
3) The reporter system also includes a guide RNA or a vector comprising a polynucleotide sequence encoding the guide RNA.
21. The reporter system of claim 20, wherein the guide RNA targets the codon sequence of amino acid 66 of the reporter protein encoded by SEQ ID No. 15; preferably, the target sequence of the guide RNA is shown in SEQ ID NO. 5.
22. Use of a reporter system according to any one of claims 19 to 21 for detecting the A-to-G editing efficiency of a fusion protein according to any one of claims 1 to 5, an isolated polynucleotide according to claim 6, an expression vector according to claim 7, an expression system according to claim 8 or 9, or a base editing system according to any one of claims 10 to 12.
23. A method of detecting the A-to-G editing efficiency of a base editing product, using the reporter system of any one of claims 19 to 22 to detect the A-to-G editing efficiency of a product to be tested.
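
The following Python sketch is illustrative only and is not part of the claims. It shows one way the two quantitative notions used above could be computed: the "more than 80% sequence identity" threshold and an A-to-G editing efficiency read from the codon for amino acid 66 of the reporter. Several points are assumptions rather than statements from the claims: identity is counted position by position over two pre-aligned sequences of equal length; the reporter is assumed to behave like GFP, in which Tyr66 gives green and His66 gives blue fluorescence; every other codon is treated as non-fluorescent; and the function names and example sequences are hypothetical, not SEQ ID NO.14-16.

```python
from collections import Counter

# Standard genetic code entries for the two residues discussed here.
CODON_TO_AA = {"TAT": "Y", "TAC": "Y", "CAT": "H", "CAC": "H"}


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumption: the claims do not say how identity is calculated, so this
    simply counts matching positions and divides by the sequence length.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def reporter_colour(codon66: str) -> str:
    """Map the codon at position 66 to an assumed fluorescence readout."""
    aa = CODON_TO_AA.get(codon66.upper())
    if aa == "Y":
        return "green"  # unedited reporter (Tyr66)
    if aa == "H":
        # TAT/TAC -> CAT/CAC is T-to-C on the coding strand,
        # i.e. A-to-G on the complementary strand.
        return "blue"
    return "none"  # assumption: any other codon gives no fluorescence


def a_to_g_efficiency(codons: list[str]) -> float:
    """Percentage of reads (or cells) whose codon 66 gives a blue readout."""
    colours = Counter(reporter_colour(c) for c in codons)
    total = sum(colours.values())
    if total == 0:
        raise ValueError("no codons supplied")
    return 100.0 * colours["blue"] / total
```

For example, `percent_identity` applied to two ten-residue sequences that differ at two positions returns exactly 80.0, which does not satisfy a "more than 80%" threshold. In practice an alignment tool would be needed before counting identities whenever the two sequences differ in length.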
CN202310186267.3A 2023-03-01 2023-03-01 Fusion protein, efficient specific base editing system containing same and application Pending CN116239703A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310186267.3A CN116239703A (en) 2023-03-01 2023-03-01 Fusion protein, efficient specific base editing system containing same and application

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310186267.3A CN116239703A (en) 2023-03-01 2023-03-01 Fusion protein, efficient specific base editing system containing same and application

Publications (1)

Publication Number Publication Date
CN116239703A (en) 2023-06-09

Family

ID=86625829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310186267.3A Pending CN116239703A (en) 2023-03-01 2023-03-01 Fusion protein, efficient specific base editing system containing same and application

Country Status (1)

Country Link
CN (1) CN116239703A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116515766A (en) * 2023-06-30 2023-08-01 上海贝斯昂科生物科技有限公司 Natural killer cell, preparation method and application thereof
CN117568313A (en) * 2024-01-15 2024-02-20 上海贝斯昂科生物科技有限公司 Gene editing composition and use thereof
CN117568313B (en) * 2024-01-15 2024-04-26 上海贝斯昂科生物科技有限公司 Gene editing composition and use thereof

Similar Documents

Publication Publication Date Title
US11555181B2 (en) Engineered cascade components and cascade complexes
US11479761B2 (en) Nuclease-mediated genome editing
JP7605852B2 (en) Class II V-type CRISPR system
US9738908B2 (en) CRISPR/Cas systems for genomic modification and gene modulation
JP2023168355A (en) Methods for improved homologous recombination and compositions thereof
EP3487992A2 (en) Methods and compositions for modifying genomic dna
CN116239703A (en) Fusion protein, efficient specific base editing system containing same and application
JP6913965B2 (en) Kit for repairing FBN1T7498C mutation, method for preparing and repairing FBN1T7498C mutation, method for repairing FBN1T7498C mutation by base editing
US20230212612A1 (en) Genome editing system and method
CN112899237A (en) CDKN1A gene reporter cell line and construction method and application thereof
WO2018031864A1 (en) Methods and compositions related to barcode assisted ancestral specific expression (baase)
US20210363206A1 (en) Proteins that inhibit cas12a (cpf1), a cripr-cas nuclease
CN112159801B (en) SlugCas9-HF protein, gene editing system containing SlugCas9-HF protein and application
WO2019089623A1 (en) Fusion proteins for use in improving gene correction via homologous recombination
US20250002882A1 (en) Cpf1 protein and its use in gene editing
CN109593743A (en) Novel C RISPR/ScCas12a albumen and preparation method thereof
US20220307011A1 (en) Coiled-coil mediated tethering of crispr/cas and exonucleases for enhanced genome editing
JP2019523005A (en) Targeted in situ protein diversification by site-specific DNA cleavage and repair
CN114686456B (en) Base editing system based on bimolecular deaminase complementation and application thereof
CN109957570A (en) Targeted editing of gRNA sequence of bcr-abl fusion gene and its application
CN118119707A (en) Use of inhibitors to increase CRISPR/Cas insertion efficiency
CN116286741A (en) Use of 5 '. Fwdarw.3' exonuclease in gene editing system, gene editing system and editing method thereof
CN114317492A (en) A modified artificial nuclease system and its application
CN115772523A (en) Base editing tool
CN105695509B (en) Method for obtaining high-purity myocardial cells

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination