In the post "Cholangiocarcinoma: An Unclear Interaction" I mentioned that the KEGG signaling pathway contains an undocumented interaction, shown below.

CDKN2A* // MDM2

So I decided to look for a way to explain the interaction between these two entities. I am taking a short detour to explore how natural language processing (NLP) can be used to extract and analyze text for this purpose.

In this post I describe the first step toward understanding that interaction.

Approach

To begin, here is what I plan to do:

1. Get text that mentions each entity from KEGG, since the KEGG database provides supporting article(s) for each entry.

2. Use Wikipedia as a knowledge graph to enrich the attributes of the relevant entities.

3. Combine the texts collected above for NLP analysis.

4. Build graphs from the sentences.

5. Apply network analysis.
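Steps 4 and 5 are not worked out in this post, so what follows is only a rough sketch of the idea, not the method used later in the series. It assumes a simple co-occurrence rule: two entities are linked when they appear in the same sentence, and the sentence is kept as evidence for the edge. The alias table, the naive sentence splitter, and all function names are hypothetical; the sample sentences are quoted from the KEGG-linked abstracts shown under step 1.

```python
import re
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical alias table: surface form in the text -> canonical entity name.
# p16(INK4a) and p14(ARF) are both products of the CDKN2A locus (abstract 9724636).
ALIASES = {
    "CDKN2A": "CDKN2A",
    "p16(INK4a)": "CDKN2A",
    "p14(ARF)": "CDKN2A",
    "MDM2": "MDM2",
    "p53": "p53",
    "CDK4": "CDK4",
    "CDK6": "CDK6",
}

# Sentences quoted from the abstracts listed under step 1.
TEXTS = [
    "The product of the alpha transcript, p16(INK4a), is a recognized tumour "
    "suppressor that induces a G1 cell cycle arrest by inhibiting the "
    "phosphorylation of the retinoblastoma protein by the cyclin-dependent "
    "kinases, CDK4 and CDK6. p14(ARF) acts by binding directly to MDM2, "
    "resulting in the stabilization of both p53 and MDM2.",
    "Here we show that recombinant-derived human MDM2 protein binds human p53 "
    "in vitro, and we use MDM2 clones to localize the human MDM2 gene to "
    "chromosome 12q13-14.",
]


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def entities_in(sentence):
    """Canonical names of all aliases found in the sentence (substring match)."""
    return {canon for alias, canon in ALIASES.items() if alias in sentence}


def build_graph(texts):
    """Step 4: link entities that co-occur in a sentence; keep the sentence."""
    neighbours = defaultdict(set)
    evidence = defaultdict(list)
    for text in texts:
        for sentence in split_sentences(text):
            for a, b in combinations(sorted(entities_in(sentence)), 2):
                neighbours[a].add(b)
                neighbours[b].add(a)
                evidence[(a, b)].append(sentence)  # key is alphabetically ordered
    return neighbours, evidence


def shortest_path(neighbours, start, goal):
    """Step 5: breadth-first search for a shortest chain of co-occurrences."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbours.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


neighbours, evidence = build_graph(TEXTS)
path = shortest_path(neighbours, "CDKN2A", "MDM2")
print(path)
for a, b in zip(path, path[1:]):
    print(a, "--", b, ":", evidence[tuple(sorted((a, b)))])
```

Substring matching and sentence-level co-occurrence are crude choices. They are enough to surface a candidate sentence linking the two entities, but a real pipeline would need proper entity recognition and disambiguation.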

I have already described my approach to the first three steps in earlier posts, so I will not repeat it here. I will simply go on to collect the texts about these two biomedical entities.

Step 1: Get text from the KEGG database

The process described above extracts the following texts from the KEGG database.

{('CDKN2A',
  '8259215'): ['The division cycle of eukaryotic cells is regulated by a family of protein kinases known as the cyclin-dependent kinases (CDKs). The sequential activation of individual members of this family and their consequent phosphorylation of critical substrates promotes orderly progression through the cell cycle. The complexes formed by CDK4 and the D-type cyclins have been strongly implicated in the control of cell proliferation during the G1 phase. CDK4 exists, in part, as a multi-protein complex with a D-type cyclin, proliferating cell nuclear antigen and a protein, p21 (refs 7-9). CDK4 associates separately with a protein of M(r) 16K, particularly in cells lacking a functional retinoblastoma protein. Here we report the isolation of a human p16 complementary DNA and demonstrate that p16 binds to CDK4 and inhibits the catalytic activity of the CDK4/cyclin D enzymes. p16 seems to act in a regulatory feedback circuit with CDK4, D-type cyclins and retinoblastoma protein.'],
 ('CDKN2A',
  '9724636'): ['The two distinct proteins encoded by the CDKN2A locus are specified by translating the common second exon in alternative reading frames. The product of the alpha transcript, p16(INK4a), is a recognized tumour suppressor that induces a G1 cell cycle arrest by inhibiting the phosphorylation of the retinoblastoma protein by the cyclin-dependent kinases, CDK4 and CDK6. In contrast, the product of the human CDKN2A beta transcript, p14(ARF), activates a p53 response manifest in elevated levels of MDM2 and p21(CIP1) and cell cycle arrest in both G1 and G2/M. As a consequence, p14(ARF)-induced cell cycle arrest is p53 dependent and can be abrogated by the co-expression of human papilloma virus E6 protein. p14(ARF) acts by binding directly to MDM2, resulting in the stabilization of both p53 and MDM2. Conversely, p53 negatively regulates p14(ARF) expression and there is an inverse correlation between p14(ARF) expression and p53 function in human tumour cell lines. However, p14(ARF) expression is not involved in the response to DNA damage. These results place p14(ARF) in an independent pathway upstream of p53 and imply that CDKN2A encodes two proteins that are involved in tumour suppression.'],
 ('MDM2',
  '11351297'): ['MDM2 gene is overexpressed in several tumors and its product may be processed into different isoforms, some of which have been demonstrated to possess transforming activity. In a panel of liposarcomas characterized by displaying 4 different combinations of mdm2/p53 immunoreactivity, molecular analysis of amplified MDM2 gene revealed a coexistence of mutated full-length MDM2 messenger RNAs, an out-of-frame splicing mRNA and finally aberrant spliced forms. Two of the latter are reported here for the first time. The molecular differences in this heterogeneous mRNA population seem to mirror distinct functional aspects of the altered encoded mdm2 proteins. In fact, besides the deleted transcripts defective in their ability to bind p53 and known to possess a transforming activity, here we describe both mutated full-length forms and deleted transcripts that still maintain the ability to bind p53 but, based on their mdm2+/p53+ immunophenotype, probably fail to signal its degradation. These aberrant forms, which are responsible for the accumulation and inactivation of p53, can contribute, together with the p53 independent transforming forms, to liposarcoma transforming pathway.'],
 ('MDM2',
  '19965871'): ['The tumor suppressor p53 is a transcription factor that regulates cell cycle, DNA repair, senescence, and apoptosis in response to DNA damage. Phosphorylation of p53 at Ser-46 is indispensable for the commitment to apoptotic cell death. A previous study has shown that upon exposure to genotoxic stress, DYRK2 translocates into the nucleus and phosphorylates p53 at Ser-46, thereby inducing apoptosis. However, less is known about mechanisms responsible for intracellular control of DYRK2. Here we show the functional nuclear localization signal at N-terminal domain of DYRK2. Under normal conditions, nuclear and not cytoplasmic DYRK2 is ubiquitinated by MDM2, resulting in its constitutive degradation. In the presence of proteasome inhibitors, we detected a stable complex of DYRK2 with MDM2 at the nucleus. Upon exposure to genotoxic stress, ATM phosphorylates DYRK2 at Thr-33 and Ser-369, which enables DYRK2 to escape from degradation by dissociation from MDM2 and to induce the kinase activity toward p53 at Ser-46 in the nucleus. These findings indicate that ATM controls stability and pro-apoptotic function of DYRK2 in response to DNA damage.'],
 ('MDM2',
  '1614537'): ['Despite extensive data linking mutations in the p53 gene to human tumorigenesis, little is known about the cellular regulators and mediators of p53 function. MDM2 is a strong candidate for one such cellular protein; the MDM2 gene was originally identified by virtue of its amplification in a spontaneously transformed derivative of mouse BALB/c cells and the MDM2 protein subsequently shown to bind to p53 in rat cells transfected with p53 genes. To determine whether MDM2 plays a role in human cancer, we have cloned the human MDM2 gene. Here we show that recombinant-derived human MDM2 protein binds human p53 in vitro, and we use MDM2 clones to localize the human MDM2 gene to chromosome 12q13-14. Because this chromosomal position appears to be altered in many sarcomas, we looked for changes in human MDM2 in such cancers. The gene was amplified in over a third of 47 sarcomas, including common bone and soft tissue forms. These results are consistent with the hypothesis that MDM2 binds to p53, and that amplification of MDM2 in sarcomas leads to escape from p53-regulated growth control. This mechanism of tumorigenesis parallels that for virally-induced tumours, in which viral oncogene products bind to and functionally inactivate p53.']}
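The numeric keys in the dictionary above look like PubMed IDs. The retrieval code lives in the earlier posts, so the following is only a minimal sketch of how such a dictionary might be assembled, under two assumptions that the post does not confirm: that the KEGG entry is available as flat text with `REFERENCE   PMID:<id>` lines, and that abstracts are fetched from the NCBI E-utilities `efetch` endpoint. `SAMPLE_ENTRY` is a hand-written fragment, not real KEGG output, and every function name is hypothetical.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Hand-written fragment imitating REFERENCE lines of a KEGG flat-file entry
# (assumed layout); the PMIDs are the CDKN2A keys from the dictionary above.
SAMPLE_ENTRY = """\
REFERENCE   PMID:8259215
REFERENCE   PMID:9724636
"""


def extract_pmids(flat_text):
    """Return the PMIDs listed on REFERENCE lines, in order of appearance."""
    return re.findall(r"^REFERENCE\s+PMID:(\d+)", flat_text, flags=re.MULTILINE)


def efetch_url(pmid):
    """Build an E-utilities URL that asks for one PubMed record as plain text."""
    query = urlencode(
        {"db": "pubmed", "id": pmid, "rettype": "abstract", "retmode": "text"}
    )
    return f"{EFETCH}?{query}"


def fetch_abstract(pmid):
    """Download the record (network call). The response carries citation
    details as well as the abstract, so it would need trimming."""
    with urlopen(efetch_url(pmid), timeout=30) as response:
        return response.read().decode("utf-8")


def collect(entity, flat_text, fetch=fetch_abstract):
    """Build {(entity, pmid): [text]} in the same shape as the dictionary above."""
    return {(entity, pmid): [fetch(pmid)] for pmid in extract_pmids(flat_text)}


# Offline demonstration: a stand-in fetcher replaces the network call.
texts = collect("CDKN2A", SAMPLE_ENTRY, fetch=lambda pmid: f"<abstract for {pmid}>")
print(sorted(texts))
```

Passing the fetcher as a parameter keeps the parsing step testable without network access. Calling `collect` with the default `fetch_abstract` would hit the NCBI service instead.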

Intermission

I will stop here, having covered step 1. In the next post I will describe how I handle entity ambiguity, that is, step 2, using Wikipedia.

Stay tuned.