  • The optimum parameter sets were detected by the

    2022-08-08

    The optimum parameter sets were identified by the highest J-statistic on the training set and then applied to the test set for evaluation. The J-statistics obtained from the test set are reported in Table 2 along with the respective parameters. We also applied the QP, G4H and PQSF methods to compare prediction quality based on the J-statistic (Table 2). Two separate QP algorithms were tested, in which the minimum G-tract length was set to 2 and 3 (QP G2+ and G3+, respectively), against varying maximum loop length. For both, the maximum J-statistic is achieved when the maximum loop length is 30 nt. For G4H, the window size was fixed at 20, and the threshold yielding the highest J-statistic on the test set was 1.2. Default parameters were used for PQSF. The TPR, FPR, and discrete parameters of the pattern that yields the highest J-statistic are listed in Table 2. The J-statistics indicate that PQSF is comparable to G4C; however, G4C may be preferable when a lower FPR, and hence higher specificity, is required.
    When the G3 + E3 + XX and G3 + E-XX models were compared with the G2 + E3 + XX and G2 + E-XX models, respectively, under identical G-tract rules, it was revealed that permitting shorter G-tracts increases both TPR and FPR. However, the difference is more prominent when atypical G-tracts are allowed. The results indicate that shorter G-tracts should be allowed only when atypical G-tracts are disallowed or when high sensitivity is crucial.
    Extreme loop permissions: In every case, prohibiting an extreme loop, as in G2 + E3 + XX vs. G2 + E-XX or G2 + E2 + XX vs. G2 + E3 + XX, led to a decrease in J-statistic of at least 0.06 points between models using identical atypical G-tract rules. This indicates that permitting an extreme loop should be preferred as a general rule. This is true for both G2GQs and G3+GQs.
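The selection procedure described above (choose parameters by the highest J-statistic on the training set, then report J on the test set) can be sketched as follows. This is a minimal illustration that assumes the J-statistic is Youden's J (TPR minus FPR) and that the predictor reduces to a single score threshold; the function names, scores, labels and candidate thresholds are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def rates(predicted: Sequence[bool], actual: Sequence[bool]) -> tuple[float, float]:
    """Return (TPR, FPR); assumes at least one positive and one negative label."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    return tp / positives, fp / negatives


def j_statistic(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Youden's J = TPR - FPR = sensitivity + specificity - 1."""
    tpr, fpr = rates(predicted, actual)
    return tpr - fpr


def select_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest J on the given (training) data."""
    return max(candidates, key=lambda t: j_statistic([s >= t for s in scores], labels))


# Hypothetical toy scores and labels (True = GQ-forming); not data from the study.
train_scores = [0.2, 1.1, 1.4, 1.6, 0.5, 1.3, 2.0, 0.8]
train_labels = [False, False, True, True, False, True, True, False]
test_scores = [1.3, 0.4, 1.7, 1.2, 0.9, 1.0]
test_labels = [True, False, True, False, False, True]

# Tune on the training set only, then evaluate once on the held-out test set.
best = select_threshold(train_scores, train_labels, candidates=[0.5, 1.0, 1.25, 1.5])
test_j = j_statistic([s >= best for s in test_scores], test_labels)
print(best, round(test_j, 3))
```

Keeping the threshold search on the training data and computing J only once on the test data is what keeps the reported value from being tuned to the test set.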
G-tract rules: Prohibiting atypical G-tracts led to a decreased J-statistic in all models, supporting the view that permitting atypical G-tracts is beneficial. It should be noted that atypical G-tracts are allowed only for G3+GQs, where the core stability is higher, and not for G2GQs, where only two G-tetrads are stacked. Among the atypical G-tracts, the highest J-statistic was obtained by models allowing bulged G-tracts. Nonetheless, the choice between models with different atypical G-tract rules had only a marginal effect, except where no atypical G-tract is allowed. Briefly, the J-statistic indicates that, in general, permitting atypical G-tracts improves the prediction model significantly, with 2B or 1B being the optimal choices.
In short, for sensitivity-focused (TPR) applications, G2 + E3 + XX models provide the highest TPRs (up to 99%). On the other hand, where specificity (and a low FPR) is essential, G3 + E3 + I1B may be preferred, as it showed the lowest FPR while maintaining a high TPR. When sensitivity and specificity have equal weight, G3 + E3 + XX is preferable because it has the highest J-statistics, which give equal weight to sensitivity and specificity. However, preferences may still be altered to suit different applications or conditions.
Moreover, the choice of discrete parameters also affects the prediction quality. To examine the discrete parameters, we used the complete reference dataset, isolated each parameter, and surveyed it against the J-statistic. For G3+E- based models (G3 + E-XX), G2GQs and extreme loops are disregarded, so the only relevant parameter is the G3+GQ loop maximum. Plotting the J-statistic against the G3+GQ loop maximum revealed that increasing the loop maximum up to 7 nt increased the J-statistic significantly, and that the J-statistic continued to increase marginally beyond 7 nt (Fig. 1B).
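The single-parameter survey described above (vary the G3+GQ loop maximum and record the J-statistic) can be sketched as below. This is only an illustration: it assumes Youden's J (TPR minus FPR) and substitutes a simplified regular-expression pattern with four G-tracts of at least three guanines and no atypical G-tracts or extreme loops, which is not the G4C implementation. The helper names, the sequences and their labels are hypothetical toy values, so the resulting numbers say nothing about the reference dataset.

```python
import re


def g3_pattern(loop_max: int) -> re.Pattern:
    """Four G-tracts (>= 3 G each) separated by three loops of 1..loop_max nt."""
    tract = "G{3,}"
    loop = f"[ACGT]{{1,{loop_max}}}"
    return re.compile(f"{tract}(?:{loop}{tract}){{3}}")


def sweep_loop_max(sequences, labels, loop_values):
    """Return {loop_max: J}, where J = TPR - FPR for the pattern at that loop_max."""
    positives = sum(1 for a in labels if a)
    negatives = len(labels) - positives
    results = {}
    for loop_max in loop_values:
        pattern = g3_pattern(loop_max)
        predicted = [pattern.search(seq) is not None for seq in sequences]
        tp = sum(1 for p, a in zip(predicted, labels) if p and a)
        fp = sum(1 for p, a in zip(predicted, labels) if p and not a)
        results[loop_max] = tp / positives - fp / negatives
    return results


# Hypothetical toy set: labels are assigned for illustration only.
sequences = [
    "GGGAGGGAGGGAGGG",              # loops of 1 nt
    "GGGTTAGGGTTAGGGTTAGGG",        # loops of 3 nt
    "GGGTTTTTGGGTTTTTGGGTTTTTGGG",  # loops of 5 nt
    "GGGTTTTTTTGGGTGGGTGGG",        # one 7 nt loop
    "ACGTACGTACGTACGTACGT",
    "TTTTTTTTTTTTTTTTTTTT",
    "GAGAGAGAGAGAGAGAGAGA",
    "GGGTTGGGTTGGCTTGGA",
]
labels = [True, True, True, True, False, False, False, False]

results = sweep_loop_max(sequences, labels, loop_values=[1, 3, 5, 7, 12])
for loop_max, j in results.items():
    print(loop_max, j)
```

Holding every other rule fixed while one value is varied is what lets a change in J be attributed to that parameter alone.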
Although a longer loop maximum improved the J-statistic for this particular reference dataset, it should be noted that this set does not represent long, non-GQ, double-stranded DNA (dsDNA) well. The average length of the non-GQ sequences in the dataset is shorter than that of the GQ-forming sequences. Moreover, elongated loops are expected to favor double-stranded forms of DNA over GQs, especially in the genome. For these reasons, we suggest limiting the G3+GQ loop to a maximum of 7 nt when scanning genomes for GQs with G3 + E-XX models, where no extreme loop is allowed.
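A scan with the suggested 7 nt loop limit could look roughly like the sketch below. It is an assumption-laden simplification, not the G4C tool: it uses a plain regular expression for four G-tracts of at least three guanines, ignores atypical G-tracts, and searches the reverse strand by matching the equivalent C-rich pattern on the forward sequence. The names `LOOP_MAX`, `build_patterns` and `scan` are hypothetical.

```python
import re

LOOP_MAX = 7  # loop limit suggested in the text for G3 + E-XX models


def build_patterns(loop_max: int = LOOP_MAX) -> dict:
    """Compile one pattern per strand: G-tracts for '+', C-tracts for '-'."""
    patterns = {}
    for strand, base in (("+", "G"), ("-", "C")):
        tract = f"{base}{{3,}}"
        loop = f"[ACGT]{{1,{loop_max}}}"
        patterns[strand] = re.compile(f"{tract}(?:{loop}{tract}){{3}}")
    return patterns


def scan(sequence: str, loop_max: int = LOOP_MAX) -> list:
    """Return sorted (start, end, strand) tuples of non-overlapping matches."""
    sequence = sequence.upper()
    hits = []
    for strand, pattern in build_patterns(loop_max).items():
        for match in pattern.finditer(sequence):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


# Toy input: a G-rich telomere-like repeat followed by its C-rich counterpart.
toy = "ATAT" + "GGGTTAGGGTTAGGGTTAGGG" + "ATAT" + "CCCTAACCCTAACCCTAACCC" + "AT"
hits = scan(toy)
print(hits)
```

Note that `finditer` reports non-overlapping matches only, so overlapping candidate motifs would need a lookahead-based pattern or a separate merging step.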