Frequency of allelic variants of the TMPRSS2 gene in a prostate cancer-free Southwestern Colombian population

OBJECTIVE: To describe the frequency of the TMPRSS2 gene and its variants in a prostate cancer-free Southwestern Colombian population. MATERIALS AND METHODS: An observational study was conducted that included cancer-free persons, regardless of age, from Southwestern Colombia. Blood samples were drawn from the participants for DNA extraction. Blood drops were collected and dried on filters, which were then immersed in phosphate buffer, and DNA was extracted with the DNeasy kit. Library preparation was carried out using the TruSeq Exome Library Prep® kit and the resulting libraries were normalized with the TruSeq Rapid Exome® kit. The commercial kits were provided by Illumina®. We sequenced the full exome and identified the variants associated with the TMPRSS2 gene. Descriptive statistics were employed for the data analysis. RESULTS: The study population was made up of 162 persons from whom 7,315,466 sequence data were obtained. The TMPRSS2 gene was found in 414 data (4.3%). The most common SNP was rs140530035 (32.1%) and the most relevant SNP sequenced was rs12329760 (10.6%). CONCLUSION: TMPRSS2 was not frequent in the population studied. The most important polymorphism associated with the TMPRSS2 gene was rs12329760.


INTRODUCTION
Prostate cancer is one of the most prevalent neoplastic pathologies in men. The estimated prevalence is 1.1 million people worldwide [1][2][3] and it is influenced by ethnicity and geographic location [4]. Populations of African descent are the most affected, showing an 11% increase in prevalence in recent years [5]. The Southwest region of Colombia is inhabited by populations of Latin American and African descent in approximately the same proportion but with different rates of disease incidence [6]. Variants of certain genes have been associated with a higher frequency of prostate cancer (BRCA1-2, ATM, NBN, TMPRSS2, among others) [7]. Serine proteases, such as the one encoded by the TMPRSS2 gene, are recognized for their roles in inflammatory and immune processes. That gene is located on chromosome 21q22.3 and is expressed at the apex of the secretory epithelium of the glands. Fusion with members of the ETS family is the most frequent chromosomal rearrangement, found in 50% of prostate cancers, and is mainly produced by the microdeletion of a portion of the TMPRSS2 gene [8]. The TMPRSS2 gene and the fusion gene (TMPRSS2:ERG) have been associated with the severity and prognosis of prostate cancer, although neither the actual pathophysiologic process nor the variant associated with that condition is well understood [9]. The fusion gene has been widely studied and is currently postulated as one of the most important biomarkers for diagnostic and prognostic purposes in the prostate cancer population [10]. There are reports in the literature on the single nucleotide polymorphisms (SNPs) most frequently related to those clinical scenarios.
The present study is important because there are no similar descriptive studies characterizing the presence of the TMPRSS2 gene and its variants in a population from Southwestern Colombia.
Our study focuses on describing the frequency of the allelic variants of the TMPRSS2 gene in that population.

MATERIALS AND METHODS
A descriptive, observational study was conducted on persons, regardless of age, from Southwestern Colombia (Nariño, Cauca, Putumayo, and Valle), within the time frame of 2014 to 2016.

Sample size
Based on an expected frequency of hereditary prostate cancer of approximately 15%, an alpha of 5%, and an expected error of 5%, the calculated sample size was 162 people. Convenience sampling was carried out.
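The inputs above (expected frequency, alpha, and expected error) are the usual ingredients of a sample size calculation for estimating a proportion. The sketch below uses the standard normal-approximation formula n = z² · p(1 − p) / d² as an illustration. The text does not state which formula or corrections were applied, so this is an assumption and the result need not reproduce the reported 162. The function name `sample_size_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, alpha=0.05):
    """Sample size to estimate a proportion p within +/- margin.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / margin^2,
    with no finite-population correction (an assumption; the formula used
    in the study is not stated).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if margin <= 0.0:
        raise ValueError("margin must be positive")
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Inputs taken from the text: expected frequency 15%, alpha 5%, error 5%
print(sample_size_for_proportion(p=0.15, margin=0.05, alpha=0.05))
```

A finite-population correction or a slightly different expected frequency would change the figure, which is one reason the output of this simplified formula may differ from the sample size reported.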
Complete exome sequencing was performed, which enabled the sequencing of all protein-coding regions (exome) in the genome, thus identifying the variants that could alter the sequence of a protein. It was carried out as follows:

DNA extraction
Blood was drawn from each participant for DNA extraction. Drops of blood were collected and dried on filter paper. The filter paper was then immersed in a phosphate buffer, and DNA was extracted with the DNeasy kit from QIAGEN® (Hilden, Germany). Each extraction was quantified, and its quality was verified, before continuing with the sequencing process. The fragments, with their corresponding sequencing adaptors, were loaded onto a HiSeq 2500 instrument.
We sequenced the full exome and identified the related variants, specifically the SNPs for the TMPRSS2 gene, which is associated with prostate cancer (PCa).
The present project was conducted following all international ethical standards. Descriptive statistics were performed in R and the results are shown in frequency tables for each gene and its associated variants. Finally, we looked for the variants in the following public databases: Exome Aggregation Consortium (ExAC), PharmGKB [11], ClinVar [12], Ensembl, and dbSNP [13], searching for a pattern through which we could use the variants we found as markers.
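The kind of frequency table described above can be sketched as follows. The study's analysis was performed in R. This Python version is only an illustration, and the function name `frequency_table` and the toy variant labels are hypothetical, not data from the study.

```python
from collections import Counter


def frequency_table(labels):
    """Return (label, count, percent) rows sorted by descending count."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [
        (label, n, round(100 * n / total, 1))
        for label, n in counts.most_common()
    ]


# Hypothetical toy data: one consequence label per observed variant call
toy_calls = [
    "synonymous", "intron", "missense", "synonymous",
    "intron", "synonymous", "stop", "missense",
]

for label, n, pct in frequency_table(toy_calls):
    print(f"{label:<12}{n:>4}{pct:>8.1f}%")
```

The same function can be applied to rsIDs instead of consequence labels to obtain a per-variant table.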

RESULTS
One hundred sixty-two participants were included in the study, providing 7,315,466 sequence data, and the TMPRSS2 gene was found in 414 data (4.3%). Missense variants were identified in 23% of the data, although the most frequent variants were synonymous and intronic variants. Only one stop variant was found in those data (Table 1).

DISCUSSION
Transmembrane protease serine 2 (TMPRSS2) is a protease composed of 492 amino acids that is expressed on the cell surface of multiple organs and is theorized to be strategically located to regulate cell-cell interactions. The TMPRSS2 gene has been shown to be positively regulated by androgenic hormones in neoplastic tissue, possibly modulating the inflammatory response of prostate cells through the activation of PAR-2 [14][15]. Prostate cancer is one of the most frequent cancers in males and the TMPRSS2 gene has historically been associated with that malignant tumor. Numerous authors have conducted studies over the past decades in an attempt to link the presence of the TMPRSS2 gene with the frequency of cancer and its prognosis [16]. Although some studies have found that the TMPRSS2 gene does not represent a worse prognosis for prostate cancer [17], an important fusion of that gene with the ERG gene has been described, with an increasing relation to the diagnosis and aggressiveness of prostate cancer (present in 50% of high-risk prostate cancers) [18][19].
We found a low frequency of the allelic variant associated with the TMPRSS2:ERG fusion gene in our cancer-free population from Southwestern Colombia. The rs12329760 variant, albeit not the most frequent SNP found in the present study, is reported in the literature to have a non-negligible allele frequency (AF) in populations from East Asia and Northern Europe (0.38 and 0.37, respectively), with a high proportion of homozygotes (> 7%). Frequency in the Hispanic population is 0.155, with a low number of homozygotes [20]. It should be noted that Southwestern Colombia has a large population of African descent, in which a higher frequency of said allelic variant (0.29) has been identified. That is an important fact to keep in mind when identifying new biomarkers for prostate cancer [20].
The most sequenced polymorphism in the present study was rs140530035. It is a very common intronic variant in the world population, with an allele frequency that reaches 0.9 [21]. Comparing populations, inhabitants of Northern Europe (Finland) have an AF of 0.99, whereas it is only 0.66 in the so-called Latino population, according to Lek et al. [21]. The rs17854725 and rs2298659 polymorphisms are synonymous variants that are rare in the Latin American population, according to the literature, with an AF of 0.15 or less, and they have no known pathologic associations. Likewise, the rs75603675 polymorphism is not known to be associated with any pathology [21].
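To illustrate how an allele frequency relates to the proportion of homozygotes mentioned for rs12329760, the sketch below computes expected genotype proportions under Hardy-Weinberg equilibrium. That equilibrium is an assumption introduced here for illustration; the cited sources report observed homozygote counts, which may differ. The function name `hardy_weinberg` is hypothetical.

```python
def hardy_weinberg(alt_af):
    """Expected genotype proportions for a biallelic site.

    alt_af is the frequency of the variant (alternate) allele.
    Assumes Hardy-Weinberg equilibrium, which is not claimed in the text.
    """
    if not 0.0 <= alt_af <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = alt_af        # variant allele frequency
    p = 1.0 - q       # reference allele frequency
    return {"hom_ref": p * p, "het": 2 * p * q, "hom_alt": q * q}


# Allele frequencies quoted in the text for rs12329760
for population, af in [("East Asia", 0.38), ("Northern Europe", 0.37),
                       ("Hispanic", 0.155), ("African descent", 0.29)]:
    expected = hardy_weinberg(af)
    print(f"{population:<16} AF={af:<6} hom_alt={expected['hom_alt']:.3f}")
```

Under this assumption, an AF near 0.38 corresponds to roughly 14% variant homozygotes, whereas an AF of 0.155 corresponds to about 2%, which is in line with the contrast described above.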

Strengths and limitations
The present study is the first to describe the frequency of the TMPRSS2 gene and its allelic variants in a cancer-free Southwestern Colombian population. An advantage of the project was the quality of the study's samples, analyses, and data. Several variants associated with the TMPRSS2 gene were identified. That information is very important for conducting future longitudinal studies in cancer-free persons to determine the risk for that disease.
A limitation of the present study was that we did not find any information indicating a pathologic relationship between the identified variants and prostate cancer.