Can't read all the abstract text from a PubMed XML file

I'm loading a PubMed XML file and want to print the entire abstract in that file. Here is my code:

import xml.etree.ElementTree as ET
tree = ET.parse('test1.xml')
root = tree.getroot()
for abs_1 in root.findall("PubmedArticle/MedlineCitation/Article/Abstract"):
    abs_2 = abs_1.find('AbstractText').text
    print(abs_2)

However, I only get the objective part of the abstract, the one tagged <AbstractText Label="AIM" NlmCategory="OBJECTIVE">. I don't get the other two parts, which are also inside <Abstract>.

For example, the XML looks something like this:

<Abstract>
<AbstractText Label="AIM" NlmCategory="OBJECTIVE">The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.</AbstractText>
<AbstractText Label="METHODS AND RESULT" NlmCategory="RESULTS">Embase, PubMed, and Google Scholar were searched until March 2020 for relevant papers. Sixteen English papers were enrolled to answer questions about procedures that are allowed to perform during the COVID-19 outbreak, patients who are in priority to receive dental care services, the conditions and necessities for patient admission, waiting room and operatory room, and personal protective equipment (PPE) that is necessary for dental clinicians and the office staff.</AbstractText>
<AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">Dental treatment should be limited to patients with urgent or emergency situation. By screening questionnaires for COVID-19, patients are divided into three groups of (a) apparently healthy, (b) suspected for COVID-19, and (c) confirmed for COVID-19. Separate waiting and operating rooms should be assigned to each group of patients to minimize the risk of disease transmission. All groups should be treated with the same protective measures with regard to PPE for the dental clinicians and staff.</AbstractText>
<CopyrightInformation>© 2020 Special Care Dentistry Association and Wiley Periodicals, Inc.</CopyrightInformation>
</Abstract>

With my code, I only get:

The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.

I'd really appreciate help with printing all of the AbstractText elements inside Abstract.


asked by lessthanenough, 28.09.2020


Answers (1)


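If you can .findall() the <Abstract> elements, wouldn't it be logical that you can .findall() the <AbstractText> elements in the same way? Note that .find() only ever returns the first matching child, which is why your code prints just the first section.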

import xml.etree.ElementTree as ET

tree = ET.parse('test1.xml')
root = tree.getroot()

for AbstractText in root.findall("PubmedArticle/MedlineCitation/Article/Abstract/AbstractText"):
    print(AbstractText.text)
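
If you also want to keep the sections of each abstract together and show their labels, something along these lines might work. This is only a sketch: the inline SAMPLE string is a shortened, hypothetical stand-in for test1.xml, the PubmedArticleSet root element is assumed from the path used in the question, and abstract_sections is a helper name made up for this example. It uses itertext() instead of .text because an AbstractText element may contain inline markup such as <i> or <sup>, in which case .text stops at the first child tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened stand-in for test1.xml (assumed root: PubmedArticleSet).
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <Abstract>
          <AbstractText Label="AIM" NlmCategory="OBJECTIVE">Aim text.</AbstractText>
          <AbstractText Label="METHODS AND RESULT" NlmCategory="RESULTS">Methods with <i>inline</i> markup.</AbstractText>
          <AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">Conclusion text.</AbstractText>
          <CopyrightInformation>Copyright notice.</CopyrightInformation>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def abstract_sections(root):
    """Return one list of (label, text) pairs per <Abstract> element."""
    abstracts = []
    for abstract in root.findall("PubmedArticle/MedlineCitation/Article/Abstract"):
        sections = []
        # findall() yields every <AbstractText> child, not just the first one.
        for part in abstract.findall("AbstractText"):
            label = part.get("Label")  # None if the abstract has no labels
            # itertext() also collects text inside and after nested tags.
            text = "".join(part.itertext()).strip()
            sections.append((label, text))
        abstracts.append(sections)
    return abstracts


root = ET.fromstring(SAMPLE)  # for a file: ET.parse('test1.xml').getroot()
for sections in abstract_sections(root):
    for label, text in sections:
        print(f"{label}: {text}" if label else text)
```

Grouping by <Abstract> keeps the sections of different articles apart when the file holds more than one PubmedArticle; the CopyrightInformation element is skipped because only AbstractText children are selected.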
answered by Tomalak, 28.09.2020