The importance of databases during pandemics: A study of GISAID during the COVID-19 pandemic
2022 (English) Independent thesis Basic level (degree of Bachelor), 10 credits / 15 HE credits
Student thesis
Alternative title
Vikten av allmänt tillgängliga databaser. En studie av GISIADS roll under COVID-19 Pandemin (Swedish)
Abstract [en]
The GISAID database has swiftly become the world’s largest genome sequence database. As of May 2022, there were over 10 million genome sequence submissions to GISAID, compared to 25 thousand in early 2020. This paper aims to examine how integral the GISAID database has been to the COVID-19 response. This was done by analysing the citation rate of GISAID in the most highly cited COVID-19 papers according to Google Scholar, by conducting interviews with relevant people in the field, and by carrying out a literature study of GISAID. The thesis that GISAID has been key in battling the pandemic could be verified in this project, as the citation rate of GISAID in COVID-19 papers increased during the pandemic. Moreover, the database has been cited as crucial to the development of vaccines and to the extensive monitoring of the virus. The vast amount of data is, however, a double-edged sword, as there are reports of problems with data quality. These problems are attributed to the current format for entering data, the lack of educational material on using the database, and potential issues related to patient confidentiality. Other reported problems are the large gaps between countries worldwide in both collection-to-submission time and the number of genomes submitted to the database. Overall, these issues should be examined and addressed through the combined efforts of GISAID, the world’s governments, and the World Health Organization. Despite these issues, we conclude that GISAID has played, and will continue to play, a critical role in the pandemic response.
Abstract [sv] (translated from Swedish)
The GISAID database has quickly grown into the world’s largest database for genome data. In May 2022, more than 10 million genome sequences had been uploaded to the database, compared with 25 thousand at the beginning of 2020. This project has the scientific goal of examining how essential the GISAID database has been in the fight against COVID-19. This goal was achieved by analysing the citation frequency of GISAID in the most cited coronavirus articles according to the search engine Google Scholar. In addition, an interview study focused on interviewees with relevant expertise was carried out, as well as a literature study of the GISAID database. The thesis that GISAID has been a key player in the measures against COVID-19 could be verified with the help of these three studies, as the citation frequency of the database rose during the pandemic. The database has also been cited in the literature as indispensable to vaccine development, diagnostics, and tracing the spread of the virus. The enormous amount of data does, however, also come with drawbacks, as there are negative reports about the quality of data from GISAID. This is due to the format for entering data, the lack of instruction manuals, and problems related to patient confidentiality. Other reported problems concern the large differences between countries in the time from sequencing to upload and in the total number of genome sequences submitted. These problems should be investigated further and addressed in collaboration between GISAID, the world’s governments, and the World Health Organization (WHO). Despite the problems with GISAID, we can conclude that the database has played, and will continue to play, a decisive role in the fight against the coronavirus pandemic.
Place, publisher, year, edition, pages
2022. p. 39
Series
TRITA-EECS-EX ; 2022:471
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:kth:diva-319747
OAI: oai:DiVA.org:kth-319747
DiVA, id: diva2:1701711
Subject / course
Computer Science
Educational program
Master of Science in Engineering - Computer Science and Technology
Supervisors
Examiners
2022-10-10, 2022-10-07, 2022-10-10. Bibliographically approved