Accuracy of data extraction of non-English language trials with Google Translate, investigators, Ethan M. Balk ... [et al.], (electronic resource)

Label
Accuracy of data extraction of non-English language trials with Google Translate
Title
Accuracy of data extraction of non-English language trials with Google Translate
Statement of responsibility
investigators, Ethan M. Balk ... [et al.]
Contributor
Subject
Language
  • eng
Summary
BACKGROUND: Systematic review prides itself on inclusion of all relevant evidence. However, study eligibility is often restricted to English language articles for practical reasons. Google Translate, a free Web-based resource for translation, has recently become available. However, it is unknown whether its translation accuracy is sufficient for Evidence-based Practice Center (EPC) systematic reviews. Therefore, we formally evaluated the accuracy of Google Translate for the purpose of data extraction from non-English language articles.
METHODS: We retrieved 10 randomized controlled trials (RCTs) in each of eight languages (Chinese, French, German, Italian, Japanese, Korean, Portuguese, and Spanish) and eight observational studies in Hebrew. Eligible studies were RCTs that reported per-treatment group results data (except for Hebrew language studies, where no RCTs were identified). Each article was translated into English using Google Translate. The time required to translate each study was tracked. Data from the original language versions of the articles were extracted by one of 10 fluent speakers who were current or former members of our EPC. The English translated versions of the articles were extracted by one of five current EPC researchers who did not speak the given language. These five researchers also performed double data extraction on 10 English language RCTs. Data extracted included: eligibility criteria, treatment description, study descriptors, quality issues, outcome description, and results. Extractors were also asked to estimate how much extra time was required for extraction compared to a similar English language article. For each study, pairs of data extractions were compared for agreement of each extracted item. We analyzed the percent agreement within sets of studies in each language for each extraction item and for groups of extraction items. We defined "high agreement" as at least 80 percent agreement within an item or article. The degree of agreement for each language was compared with that of the English language study comparisons with nonparametric tests.
RESULTS: The length of time required to translate articles ranged from seconds (51 articles, 58 percent) to about 1 hour. Assessment by the English language data extractors indicated that "a little" extra time was required for 40 articles (45 percent) and "a lot" for 42 (48 percent). When evaluating all extraction items together, Portuguese and German articles had the best agreement between original and translated extractions, with high agreement between extractors among about 60 percent of the items, compared with 80 percent in English articles. Spanish, Hebrew, and Chinese had the lowest agreement (30 percent, 24 percent, and 8 percent, respectively). The absolute agreement and the proportion of items with high agreement were statistically significantly worse for all languages, compared with English. Eight of 10 English language articles had high agreement for all items, compared with 7 of 10 Portuguese articles; 6 of 10 German articles; 4 of 10 French, Italian, and Korean articles; 3 of 8 Hebrew articles; 3 of 10 Japanese and Spanish articles; and no Chinese articles.
CONCLUSION: Translation was not always possible, but generally required few resources. Across all languages, data extraction from translated articles was less accurate than from English language articles. Accurate extraction was possible for some articles in all languages except Chinese, with Portuguese and German articles yielding the most accurate extractions. Use of Google Translate has the potential to reduce language bias; however, reviewers may need to be more cautious about using data from these translated articles.
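The item-level agreement measure described in the summary can be illustrated with a short Python sketch. This is not the investigators' code: the function names, the sample data, and the choice to count two extractions as agreeing only when they are exactly equal are all assumptions made for illustration. Only the 80 percent threshold for "high agreement" comes from the summary, and the sketch covers agreement per item, not per article.

```python
from collections import defaultdict

# Threshold taken from the summary: "high agreement" is at least 80 percent.
HIGH_AGREEMENT = 0.80


def percent_agreement(pairs):
    """Fraction of (original, translated) extraction pairs that match.

    Exact equality is an assumed matching rule, used here for illustration.
    """
    if not pairs:
        raise ValueError("no extraction pairs to compare")
    matches = sum(1 for original, translated in pairs if original == translated)
    return matches / len(pairs)


def agreement_by_item(studies):
    """Group extraction pairs by item across one language's studies."""
    pairs_by_item = defaultdict(list)
    for study in studies:
        for item, pair in study.items():
            pairs_by_item[item].append(pair)
    return {item: percent_agreement(pairs) for item, pairs in pairs_by_item.items()}


def share_with_high_agreement(agreement):
    """Proportion of items whose agreement meets the 80 percent threshold."""
    high = sum(1 for value in agreement.values() if value >= HIGH_AGREEMENT)
    return high / len(agreement)


# Hypothetical extractions for five studies in one language. Each value is
# (extracted from the original article, extracted from the translation).
studies = [
    {"sample_size": (120, 120), "primary_outcome": ("HbA1c", "HbA1c")},
    {"sample_size": (64, 64), "primary_outcome": ("pain score", "pain")},
    {"sample_size": (200, 200), "primary_outcome": ("mortality", "mortality")},
    {"sample_size": (48, 84), "primary_outcome": ("relapse", "recurrence")},
    {"sample_size": (90, 90), "primary_outcome": ("blood pressure", "blood pressure")},
]

per_item = agreement_by_item(studies)
print(per_item)
print(share_with_high_agreement(per_item))
```

In this made-up data set, four of five sample sizes match (80 percent, which counts as high agreement) and three of five outcome descriptions match (60 percent, which does not), so half of the items reach high agreement.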
Member of
Cataloging source
DNLM
Funding information
Prepared for: Agency for Healthcare Research and Quality, U.S. Department of Health and Human Services, 540 Gaither Road, Rockville, MD 20850; www.ahrq.gov. Contract No. 290-2007-10055 I. Prepared by: Tufts Evidence-based Practice Center, Tufts Medical Center, Boston, MA.
Government publication
federal national government publication
Illustrations
illustrations
Index
no index present
Literary form
non fiction
Nature of contents
  • dictionaries
  • bibliography
NLM call number
W 26.55.C7
http://library.link/vocab/relatedWorkOrContributorName
  • Balk, Ethan
  • United States
  • Tufts Evidence-based Practice Center
Series statement
  • Methods research report
  • AHRQ publication
Series volume
no. 12-EHC056-EF
http://library.link/vocab/subjectName
  • Internet
  • Translating
  • Information Services
  • Communication Barriers
Label
Accuracy of data extraction of non-English language trials with Google Translate, investigators, Ethan M. Balk ... [et al.], (electronic resource)
Instantiates
Publication
Note
"April 2012."
Bibliography note
Includes bibliographical references
Color
multicolored
Dimensions
unknown
Extent
1 online resource (PDF file (various pagings))
Form of item
online
Other physical details
ill.
Specific material designation
remote
System control number
  • 1585928
  • (DNLM)BKSHLF:NBK95238

Library Locations

  • African Studies Library
    771 Commonwealth Avenue, 6th Floor, Boston, MA, 02215, US
  • Alumni Medical Library
    72 East Concord Street, Boston, MA, 02118, US
  • Astronomy Library
    725 Commonwealth Avenue, 6th Floor, Boston, MA, 02445, US
  • Fineman and Pappas Law Libraries
    765 Commonwealth Avenue, Boston, MA, 02215, US
  • Frederick S. Pardee Management Library
    595 Commonwealth Avenue, Boston, MA, 02215, US
  • Howard Gotlieb Archival Research Center
    771 Commonwealth Avenue, 5th Floor, Boston, MA, 02215, US
  • Mugar Memorial Library
    771 Commonwealth Avenue, Boston, MA, 02215, US
  • Music Library
    771 Commonwealth Avenue, 2nd Floor, Boston, MA, 02215, US
  • Pickering Educational Resources Library
    2 Silber Way, Boston, MA, 02215, US
  • School of Theology Library
    745 Commonwealth Avenue, 2nd Floor, Boston, MA, 02215, US
  • Science & Engineering Library
    38 Cummington Mall, Boston, MA, 02215, US
  • Stone Science Library
    675 Commonwealth Avenue, Boston, MA, 02445, US