The genome and proteins of HIV (human immunodeficiency virus) have been the subject of extensive research since the discovery of the virus in 1983. In the search for the causative agent, it was initially believed that the virus was a form of the human T-cell leukemia virus (HTLV), which was known at the time to affect the human immune system and cause certain leukemias. However, researchers at the Pasteur Institute in Paris isolated a previously unknown and genetically distinct retrovirus in patients with AIDS, which was later named HIV. Each virion comprises a viral envelope and associated matrix enclosing a capsid, which itself encloses two copies of the single-stranded RNA genome and several enzymes. The virus was discovered two years after the first major cases of AIDS-associated illnesses were reported.
P17 may refer to: p17 protein, a protein of HIV; Papyrus 17, a biblical manuscript; English Electric P.17, a fighter jet take-off platform concept. See also: 17P (disambiguation).
A DNA sequence. The presence of an ORF does not necessarily mean that the region is always translated. For example, in a randomly generated DNA sequence with an equal percentage of each nucleotide, a stop codon would be expected once every 21 codons. A simple gene prediction algorithm for prokaryotes might look for a start codon followed by an open reading frame that is long enough to encode
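The figure of one stop codon per roughly 21 codons can be checked with a short calculation. The sketch below assumes the standard genetic code (three stop codons out of 64) and independent, equally likely nucleotides; the function names are illustrative and not taken from any tool named in this article.

```python
from fractions import Fraction
from itertools import product

# Assumption: standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_probability() -> Fraction:
    """Probability that a random codon is a stop codon (uniform nucleotides)."""
    codons = ["".join(c) for c in product("ACGT", repeat=3)]
    stops = sum(1 for codon in codons if codon in STOP_CODONS)
    return Fraction(stops, len(codons))


def expected_codons_per_stop() -> Fraction:
    """Mean spacing between stop codons, in codons (geometric distribution)."""
    return 1 / stop_codon_probability()


def prob_open_run(n_codons: int) -> float:
    """Chance that n_codons consecutive random codons contain no stop."""
    return float((1 - stop_codon_probability()) ** n_codons)


if __name__ == "__main__":
    print(stop_codon_probability())           # 3/64
    print(float(expected_codons_per_stop()))  # about 21.3
    print(prob_open_run(100))                 # under 1% for a 100-codon run
```

Under these assumptions a stop-free run of 100 codons arises by chance less than 1% of the time, which is one way to see why length thresholds are used as supporting evidence for a coding region.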
A stop codon (usually UAA, UAG or UGA in RNA). That start codon (not necessarily the first) indicates where translation may start. The transcription termination site is located after the ORF, beyond the translation stop codon. If transcription were to cease before the stop codon, an incomplete protein would be made during translation. In eukaryotic genes with multiple exons, introns are removed and exons are then joined together after transcription to yield
A DNA strand has three distinct reading frames. The double helix of a DNA molecule has two anti-parallel strands; with the two strands having three reading frames each, there are six possible frame translations. The ORF Finder (Open Reading Frame Finder) is a graphical analysis tool which finds all open reading frames of a selectable minimum size in a user's sequence or in a sequence already in
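The six reading frames described above can be scanned with a few lines of code. The following is a minimal sketch, not the algorithm of ORF Finder or any other tool: it assumes the standard stop codons, treats only ATG as a start codon, and reports coordinates on the strand being scanned. The names `find_orfs` and `min_codons` are hypothetical.

```python
# Assumptions: standard genetic code, ATG as the only start codon,
# and an input made of the plain DNA letters A, C, G and T.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 1):
    """Scan all six frames for start-to-stop ORFs.

    Returns tuples (strand, frame, start, end). start and end are
    half-open offsets on the scanned strand, with the stop codon
    included. min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # first in-frame start opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Raising `min_codons` mirrors the "selectable minimum size" mentioned above; real tools additionally support alternative genetic codes and alternative start codons, which this sketch leaves out.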
A conical capsid composed of the viral protein p24, typical of lentiviruses. The two RNAs are often identical, yet they are not independent, but form a compact dimer within the virion. Several reasons have been proposed for why two copies of RNA are packaged rather than just one, and the explanation is probably a combination of these advantages. One advantage is that the two RNA copies are vital to HIV-1 recombination, which occurs during reverse transcription in viral replication, thus increasing genetic diversity. Another advantage
A hit in BLASTX, the program predicts the coding regions based on the translation reading frames identified in BLASTX alignments; otherwise, it predicts the most probable coding region based on the intrinsic signals of the query sequences. The output is the predicted peptide sequences in FASTA format, and a definition line that includes the query ID, the translation reading frame and the nucleotide positions where
A less than 10 kb genome. HIV has a 9.2 kb unspliced genomic transcript which encodes the gag and pol precursors; a singly spliced 4.5 kb mRNA encoding Env, Vif, Vpr and Vpu; and a multiply spliced 2 kb mRNA encoding Tat, Rev and Nef. Several conserved secondary structure elements have been identified within the HIV RNA genome. The HIV viral RNA structures regulate the progression of reverse transcription. The 5' UTR structure consists of a series of stem-loop structures connected by small linkers. These stem-loops (5' to 3') include
P17 protein
The complete sequence of
A subset of patients that have been infected for many months to years) bind to, or are adapted to cope with, these envelope glycans. The molecular structure of the viral spike has now been determined by X-ray crystallography and cryo-electron microscopy. These advances in structural biology were made possible by the development of stable recombinant forms of the viral spike through the introduction of an intersubunit disulphide bond and an isoleucine to proline mutation in gp41. The so-called SOSIP trimers not only reproduce
A typical protein, where the codon usage of that region matches the frequency characteristic for the given organism's coding regions. Therefore, some authors say that an ORF should have a minimal length, e.g. 100 codons or 150 codons. By itself even a long open reading frame is not conclusive evidence for the presence of a gene. Some short ORFs (sORFs), also named small open reading frames, usually < 100 codons in length, that lack
A virion but the production of only a single DNA provirus is called pseudodiploidy. The RNA component is 9749 nucleotides long and bears a 5' cap (Gppp), a 3' poly(A) tail, and many open reading frames (ORFs). Viral structural proteins are encoded by long ORFs, whereas smaller ORFs encode regulators of the viral life cycle: attachment, membrane fusion, replication, and assembly. The single-strand RNA
Is an R package in Bioconductor for finding open reading frames and using next-generation sequencing technologies to justify ORFs. orfipy is a tool written in Python / Cython to extract ORFs in an extremely fast and flexible manner. orfipy can work with plain or gzipped FASTA and FASTQ sequences, and provides several options to fine-tune ORF searches; these include specifying
Is a program which not only gives information about the coding and non-coding sequences but can also perform pairwise global alignment of different gene or DNA region sequences. The tool efficiently finds the ORFs for corresponding amino acid sequences, converts them into their single-letter amino acid code, and provides their locations in the sequence. The pairwise global alignment between the sequences makes it convenient to detect
Is considered within a studied region of a prokaryotic DNA sequence, where only one of the six possible reading frames will be "open" (the "reading", however, refers to the RNA produced by transcription of the DNA and its subsequent interaction with the ribosome in translation). Such an ORF may contain a start codon (usually AUG in terms of RNA) and by definition cannot extend beyond
Is essential for HIV-1 entry into cells. Env serves as a molecular target for drugs that treat HIV-1 infection and as a source of immunogens for AIDS vaccine development. However, the structure of the functional Env trimer has remained elusive.
Open reading frame
In molecular biology, reading frames are defined as spans of DNA sequence between the start and stop codons. Usually, this
Is responsible for binding to its primary host receptor, CD4, and its co-receptor (mainly CCR5 or CXCR4), leading to viral entry into its target cell. As the only proteins on the surface of the virus, the envelope glycoproteins (gp120 and gp41) are the major targets for HIV vaccine efforts. Over half of the mass of the trimeric envelope spike is N-linked glycans. The density is high as
Is that having two copies of RNA would allow the reverse transcriptase to switch templates when encountering a break in the viral RNA, thus completing the reverse transcription without loss of genetic information. Yet another reason is that the dimeric nature of the RNA genome of the virus may play a structural role in viral replication. The containment of two copies of single-stranded RNA within
Is tightly bound to p7 nucleocapsid proteins, late assembly protein p6, and enzymes essential to the development of the virion, such as reverse transcriptase and integrase. Lysine tRNA is the primer of the magnesium-dependent reverse transcriptase. The nucleocapsid associates with the genomic RNA (one molecule per hexamer) and protects the RNA from digestion by nucleases. Also enclosed within
Is ~100 nm in diameter. Its innermost region consists of a cone-shaped core that includes two copies of the (positive-sense) ssRNA genome, the enzymes reverse transcriptase, integrase and protease, some minor proteins, and the major core protein. The genome of human immunodeficiency virus (HIV) encodes 8 viral proteins playing essential roles during the HIV life cycle. HIV-1 is composed of two copies of noncovalently linked, unspliced, positive-sense single-stranded RNA enclosed by
The glycans shield underlying viral protein from neutralisation by antibodies. This is one of the most densely glycosylated molecules known and the density is sufficiently high to prevent the normal maturation process of glycans during biogenesis in the endoplasmic reticulum and Golgi apparatus. The majority of the glycans are therefore stalled as immature 'high-mannose' glycans not normally present on secreted or cell surface human glycoproteins. The unusual processing and high density mean that almost all broadly neutralising antibodies that have so far been identified (from
The HIV family and is thought to influence the viral life cycle. The third variable loop, or V3 loop, is a region of the human immunodeficiency virus. The V3 loop of the virion's envelope glycoprotein, gp120, allows the virus to infect human immune cells by binding to a chemokine receptor on the target cell, such as CCR5 or CXCR4, depending on the strain of HIV. The envelope glycoprotein (Env) gp120/41
The HIV life cycle by altering the function of HIV protease and reverse transcriptase, although not all elements identified have been assigned a function. An RNA secondary structure determined by SHAPE analysis has been shown to contain three stem loops and is located between the HIV protease and reverse transcriptase genes. This cis-regulatory RNA has been shown to be conserved throughout
The HIV-1 genome, extracted from infectious virions, has been solved to single-nucleotide resolution. The HIV genome encodes a small number of viral proteins, invariably establishing cooperative associations among HIV proteins and between HIV and host proteins, to invade host cells and hijack their internal machineries. HIV is different in structure from other retroviruses. The HIV virion
The antigenic properties of the native viral spike but also display the same degree of immature glycans as presented on the native virus. Recombinant trimeric viral spikes are promising vaccine candidates as they display fewer non-neutralising epitopes than recombinant monomeric gp120, which act to suppress the immune response to target epitopes. HIV has several major genes coding for structural proteins that are found in all retroviruses as well as several nonstructural ("accessory") genes unique to HIV. The HIV genome contains nine genes that encode fifteen viral proteins. These are synthesized as polyproteins which produce proteins for the virion interior, called Gag (group-specific antigen);
The basic physical infrastructure of the virus, and pol provides the basic mechanism by which retroviruses reproduce, while the others help HIV to enter the host cell and enhance its reproduction. Though they may be altered by mutation, all of these genes except tev exist in all known variants of HIV; see Genetic variability of HIV. HIV employs a sophisticated system of differential RNA splicing to obtain nine different gene products from
The classical hallmarks of protein-coding genes (both from ncRNAs and mRNAs) can produce functional peptides. The 5'-UTRs of about 50% of mammalian mRNAs are known to contain one or several sORFs, also called upstream ORFs or uORFs. However, less than 10% of the vertebrate mRNAs surveyed in an older study contained AUG codons in front of the major ORF. Interestingly, uORFs were found in two thirds of proto-oncogenes and related proteins. 64–75% of experimentally found translation initiation sites of sORFs are conserved in
The coding region begins and ends. OrfPredictor facilitates the annotation of EST-derived sequences, particularly for large-scale EST projects. OrfPredictor uses a combination of the two different ORF definitions mentioned above. It searches stretches starting with a start codon and ending at a stop codon. As an additional criterion, it searches for a stop codon in the 5' untranslated region (UTR or NTR, nontranslated region). ORFik
The context of transcriptomics and metagenomics, where a start or stop codon may not be present in the obtained sequences. Such an ORF corresponds to parts of a gene rather than the complete gene. One common use of open reading frames (ORFs) is as one piece of evidence to assist in gene prediction. Long ORFs are often used, along with other evidence, to initially identify candidate protein-coding regions or functional RNA-coding regions in
The database. This tool identifies all open reading frames using the standard or alternative genetic codes. The deduced amino acid sequence can be saved in various formats and searched against the sequence database using the basic local alignment search tool (BLAST) server. The ORF Finder should be helpful in preparing complete and accurate sequence submissions. It is also packaged with the Sequin sequence submission software (sequence analyser). ORF Investigator
The different mutations, including single nucleotide polymorphisms. The Needleman–Wunsch algorithm is used for the gene alignment. The ORF Investigator is written in the portable Perl programming language, and is therefore available to users of all common operating systems. OrfPredictor is a web server designed for identifying protein-coding regions in expressed sequence tag (EST)-derived sequences. For query sequences with
The final mRNA for protein translation. In the context of gene finding, the start-stop definition of an ORF therefore only applies to spliced mRNAs, not genomic DNA, since introns may contain stop codons and/or cause shifts between reading frames. An alternative definition says that an ORF is a sequence that has a length divisible by three and is bounded by stop codons. This more general definition can be useful in
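The alternative, stop-to-stop definition can be illustrated with a small sketch. It assumes the standard stop codons and, as a labeled simplification, also lets the ends of the sequence act as boundaries, since partial transcripts may lack a flanking stop codon. The function name `stop_bounded_orfs` and its parameters are hypothetical.

```python
# Assumption: standard genetic code stop codons, DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_bounded_orfs(seq: str, frame: int = 0, min_codons: int = 1):
    """Return half-open (start, end) spans of stop-free codon runs in one frame.

    Each span has a length divisible by three and excludes the stop
    codons that delimit it. Runs touching either end of the sequence are
    kept too (an assumption made for fragmentary sequences).
    """
    seq = seq.upper()
    spans = []
    run_start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if (i - run_start) // 3 >= min_codons:
                spans.append((run_start, i))
            run_start = i + 3  # next run begins after the stop codon
    # Trailing run that reaches the last complete codon without a stop.
    end = frame + ((len(seq) - frame) // 3) * 3
    if (end - run_start) // 3 >= min_codons:
        spans.append((run_start, end))
    return spans


if __name__ == "__main__":
    print(stop_bounded_orfs("AAATAGCCCGGGTAAT"))
```

Unlike a start-to-stop scan, this version needs no start codon, which is why it suits sequences that cover only part of a gene.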
The genomes of human and mouse and may indicate that these elements have function. However, sORFs can often be found only in the minor forms of mRNAs and avoid selection; the high conservation of initiation sites may be connected with their location inside promoters of the relevant genes. This is characteristic of the SLAMF1 gene, for example. Since DNA is interpreted in groups of three nucleotides (codons),
The trans-activation region (TAR) element, the 5' polyadenylation signal [poly(A)], the PBS, the DIS, the major SD and the ψ hairpin structure located within the 5' end of the genome and the HIV Rev response element (RRE) within the env gene. Another RNA structure that has been identified is gag stem loop 3 (GSL3), thought to be involved in viral packaging. RNA secondary structures have been proposed to affect
The viral enzymes (Pol, polymerase) or the glycoproteins of the virion env (envelope). In addition to these, HIV encodes proteins which have certain regulatory and auxiliary functions as well. HIV-1 has two important regulatory elements, Tat and Rev, and a few important accessory proteins such as Nef, Vpr, Vif and Vpu which are not essential for replication in certain tissues. The gag gene provides
The virion particle are Vif, Vpr, Nef, and viral protease. The envelope of the virion is formed by a plasma membrane of host cell origin, which is supported by a matrix composed of the viral p17 protein, ensuring the integrity of the virion particle. At the surface of the virion can be found a limited number of the envelope glycoprotein (Env) of HIV, a trimer formed by heterodimers of gp120 and gp41. Env