The HIV Long Terminal Repeat: Structure and Function

The human immunodeficiency virus (HIV) is a retrovirus: after infecting a host cell, it converts its RNA genome into DNA. This viral DNA must be inserted permanently into the host cell’s genetic material, where it is called a provirus, and its expression must then be controlled. Both tasks depend on the Long Terminal Repeat (LTR), a sequence repeated at each end of the viral DNA. The LTR is the central control element: its ends are required for integration into the host genome, and its regulatory sequences determine the timing and level of viral gene expression.

Molecular Structure and Assembly

The complete Long Terminal Repeat is not present in the RNA genome of the virus; it is assembled during reverse transcription inside the host cell. This process converts the single-stranded viral RNA into double-stranded viral DNA and produces two identical LTR sequences that flank the viral coding genes. Each LTR is divided into three regions: Unique 3′ (U3), Repeat (R), and Unique 5′ (U5).

The LTR sequences are derived from both ends of the original RNA. U3 is a unique sequence from the 3′ end, U5 is a unique sequence from the 5′ end, and R is a short sequence present at both ends. Reverse transcription copies these pieces so that each end of the viral DNA carries the complete U3-R-U5 arrangement, duplicating the regulatory information.
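The rearrangement described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the RNA genome is ordered R, U5, coding region, U3, R; the function name `provirus_layout` and the segment labels are hypothetical, and the sketch models only the end result of reverse transcription, not its mechanism.

```python
# Toy model of LTR assembly. Region names are labels only, not real sequences.

def provirus_layout(rna_layout):
    """Return the viral DNA layout implied by an RNA genome layout.

    Assumes the RNA is ordered: R, U5, <coding segments...>, U3, R.
    Only the outcome of reverse transcription is modeled here.
    """
    if len(rna_layout) < 5:
        raise ValueError("expected R, U5, at least one coding segment, U3, R")
    r_5prime, u5, *coding, u3, r_3prime = rna_layout
    if r_5prime != r_3prime:
        raise ValueError("the R region must be identical at both RNA ends")
    # Each complete LTR combines U3 (from the 3' end), R, and U5 (from the 5' end).
    ltr = [u3, r_5prime, u5]
    return ltr + coding + ltr


rna = ["R", "U5", "coding", "U3", "R"]
print("-".join(provirus_layout(rna)))  # U3-R-U5-coding-U3-R-U5
```

The output makes the duplication visible: neither end of the RNA holds a full LTR, yet both ends of the DNA do.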

The U3 region contains the primary regulatory elements for transcription control, including the promoter and enhancer sequences. The R region encodes the Trans-Activation Response (TAR) element, which folds into a hairpin structure in the transcribed RNA and binds the viral protein Tat. These three regions, duplicated and positioned at the ends of the proviral DNA, create the molecular framework for the subsequent steps of the viral life cycle.

Role in Viral Integration

The LTR is indispensable for the permanent insertion of the viral DNA into the host cell’s chromosome, a process catalyzed by the viral integrase enzyme. Once the LTRs are formed at the ends of the double-stranded viral DNA, they become the specific recognition sites for integrase. Integrase engages the LTR ends within a large complex of viral and cellular proteins known as the pre-integration complex.

The integration process begins with integrase trimming two nucleotides from the 3′ end of each strand of the viral DNA, one at each LTR terminus, in a reaction known as 3′-processing. Integrase then uses these processed ends in a strand transfer reaction that joins the viral DNA directly to the host cell’s chromosomal DNA. This permanently inserts the provirus into the host genome, with the LTRs marking the physical boundaries of the integrated genetic material.
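As a rough illustration of 3′-processing, the sketch below removes two nucleotides from the 3′ end of each strand of a short double-stranded DNA. The sequence is a made-up toy, not an HIV sequence, and the helper names `reverse_complement` and `process_ends` are hypothetical. The toy was chosen so that both trimmed strands end in CA, on the assumption that processing leaves a conserved CA dinucleotide at each viral DNA end.

```python
# Toy sketch of 3'-processing on a blunt-ended double-stranded DNA.
# Strands are written 5'->3'. The sequence is illustrative, not from HIV.

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand):
    """Return the complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def process_ends(top_strand, trim=2):
    """Trim `trim` nucleotides from the 3' end of both strands.

    Returns (top, bottom), each written 5'->3' after trimming.
    """
    if len(top_strand) <= trim:
        raise ValueError("strand too short to trim")
    bottom_strand = reverse_complement(top_strand)
    return top_strand[:-trim], bottom_strand[:-trim]


top, bottom = process_ends("ACTGGATTACACAGT")
print(top)     # ACTGGATTACACA
print(bottom)  # ACTGTGTAATCCA
```

Because each strand loses nucleotides only at its own 3′ end, the two 5′ ends are left as short overhangs. The strand transfer step that follows is not modeled here.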

Regulating Viral Gene Expression

Once the provirus is integrated, the 5′ LTR controls the transcription of all viral genes. The U3 region within the 5′ LTR acts as a combined promoter and enhancer, recruiting host cell machinery to initiate RNA synthesis. This region contains the core promoter elements, including the TATA box.

The U3 region also features binding sites for numerous host transcription factors, which act as the primary modulators of LTR activity. For instance, the LTR enhancer region contains binding sites for the host factor NF-κB. The LTR’s activity is further amplified by the viral protein Tat, which binds to the Trans-Activation Response (TAR) element located in the R region of the newly transcribed viral RNA.

The interaction of Tat with the TAR RNA hairpin recruits host factors that promote efficient elongation of the viral RNA transcript. Because Tat is itself produced from LTR-driven transcripts, more transcription yields more Tat, which in turn drives more transcription: a positive feedback loop. The interplay between host transcription factors binding to the U3 region and this Tat-TAR loop determines whether the virus actively replicates or remains transcriptionally silent, a state known as latency. When activating host factors are sequestered or Tat levels are low, the loop is not engaged and the LTR remains quiet, maintaining the latent state of the provirus.
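The switch-like behavior of this loop can be illustrated with a deliberately simple numerical sketch. It assumes that Tat is produced at a basal rate set by host factors plus a cooperative, Tat-dependent term, and is lost at a constant rate. The function `simulate_tat` and every parameter value are hypothetical and chosen only to show the qualitative effect; none are measured quantities.

```python
# Toy positive-feedback model of Tat-driven LTR activity (arbitrary units).
# All parameters are illustrative assumptions, not experimental values.

def simulate_tat(basal_rate, steps=2000, dt=0.05,
                 vmax=1.0, k_half=1.0, decay=0.4):
    """Integrate d(tat)/dt = basal + feedback(tat) - decay * tat by Euler steps.

    basal_rate stands in for host-factor-driven transcription from U3.
    feedback(tat) is a Hill-type term standing in for Tat-TAR amplification.
    Returns the Tat level after steps * dt time units, starting from zero.
    """
    tat = 0.0
    for _ in range(steps):
        feedback = vmax * tat**2 / (k_half**2 + tat**2)
        tat += dt * (basal_rate + feedback - decay * tat)
    return tat


quiet = simulate_tat(basal_rate=0.01)   # weak host-factor input
active = simulate_tat(basal_rate=0.2)   # strong host-factor input
print(f"low input:  Tat = {quiet:.3f}")
print(f"high input: Tat = {active:.3f}")
```

In this sketch a weak basal input leaves Tat near zero, while a stronger input pushes the system past a threshold and the feedback term carries it to a much higher level. This mirrors the silent and active states described above, though real LTR regulation involves many more factors.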

Targeting the LTR in Treatment

The LTR’s central role in integration and transcriptional control makes it a target for anti-HIV therapies. Integration is inhibited by a class of drugs known as integrase strand transfer inhibitors, which bind to integrase when it is assembled on the viral DNA ends and block the strand transfer reaction, so the processed LTR ends cannot be joined to the host DNA.

Beyond integration, the LTR’s function as a transcriptional regulator is the focus of strategies aimed at the latent viral reservoir. The persistence of latently infected cells, in which the provirus is silenced in part by a repressive chromatin structure at the LTR, is a major barrier to a cure. Two primary approaches are being explored to address this latency: “shock and kill” and “block and lock.”

The “shock and kill” strategy uses latency-reversing agents to activate the LTR, “shocking” the virus out of its dormant state so that the infected cell can be eliminated by the immune system or by viral cytopathic effects. Conversely, the “block and lock” approach seeks to durably silence the LTR using latency-promoting agents that reinforce repressive epigenetic modifications at the promoter. This strategy aims to lock the provirus in a deep, non-inducible state, pursuing a functional cure by preventing future viral transcription. One example of a “block and lock” agent is the Tat inhibitor didehydro-cortistatin A, which disrupts the Tat-TAR amplification loop and thereby suppresses transcription from the LTR.