A comparison of database systems for XML-type data

J.E. Risse, J.A.M. Leunissen

Research output: Contribution to journal › Article › Academic › peer-review

2 Citations (Scopus)


Background: In the field of bioinformatics, interchangeable data formats based on XML are widely used. XML-type data is also at the core of most web services. As the amount of data stored in XML increases, so does the need to store and access that data. In this paper we analyse the suitability of different database systems for storing and querying large datasets in general and Medline in particular.

Results: All reviewed database systems perform well when tested with small to medium-sized datasets; however, when the full Medline dataset is queried, a large variation in query times is observed.

Conclusions: No single system is vastly superior to the others in this comparison, and the most suitable system depends on the database size and the query requirements. The best all-round solution is the Oracle 11g database system using the new binary storage option. Alias-i's Lingpipe is a more lightweight, customizable and sufficiently fast solution; it does, however, require more initial configuration steps. For data with a changing XML structure, the native XML database systems Sedna and BaseX, or MySQL with an XML-type column, are suitable.
Original language: English
Pages (from-to): 193-205
Journal: In Silico Biology
Issue number: 3-4
Publication status: Published - 2010

