On the shortcomings of array and scientific dataset support in relational databases
Array support in the relational model

If you ask a computational scientist why he doesn’t store his data in a relational database—and he almost certainly doesn’t—he’ll probably tell you that the system doesn’t properly support arrays. When interrogated for the details, the main problems he’s likely to cite are high performance overheads, poor storage utilization, cumbersome interfaces to array data, and sparse support for the kinds of access patterns and primitives that are common in scientific data processing.

To me this situation seems curious because array structures are highly regular. They show none of the problems with representational power that are normally used to justify object stores and ad hoc flat file formats. In fact most arrays used in the scientific context are extremely regular, with a neat, rectangular domain, and some real vector space represented as a tuple of floating point numbers as the range. That would seem to make them a straightforward application for relational algebra, and a highly restricted one at that. So where’s the problem?
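To make the mapping concrete, here is a minimal sketch of a two-dimensional array of 2-vectors stored as a relation, using Python's standard `sqlite3` module. The table name `field`, the columns `i`, `j`, `u`, `v`, and both helper functions are hypothetical names chosen for illustration. The rectangular domain becomes a composite key, the range becomes a tuple of floating point columns, and a subarray becomes a plain range selection.

```python
import sqlite3


def array_to_relation(conn, grid):
    """Store a 2-D grid of (u, v) pairs as rows keyed by their indices."""
    # The index pair (i, j) is the domain; (u, v) is the range.
    conn.execute(
        "CREATE TABLE field ("
        " i INTEGER, j INTEGER, u REAL, v REAL,"
        " PRIMARY KEY (i, j))"
    )
    rows = [
        (i, j, u, v)
        for i, row in enumerate(grid)
        for j, (u, v) in enumerate(row)
    ]
    conn.executemany("INSERT INTO field VALUES (?, ?, ?, ?)", rows)


def subarray(conn, i_lo, i_hi, j_lo, j_hi):
    """Select a rectangular window (bounds inclusive) in row-major order."""
    cur = conn.execute(
        "SELECT i, j, u, v FROM field"
        " WHERE i BETWEEN ? AND ? AND j BETWEEN ? AND ?"
        " ORDER BY i, j",
        (i_lo, i_hi, j_lo, j_hi),
    )
    return cur.fetchall()


# A 3 x 4 grid whose value at (i, j) is simply the vector (i, j).
conn = sqlite3.connect(":memory:")
grid = [[(float(i), float(j)) for j in range(4)] for i in range(3)]
array_to_relation(conn, grid)
window = subarray(conn, 1, 2, 0, 1)
total = conn.execute("SELECT COUNT(*) FROM field").fetchone()[0]
```

Note that this naive layout stores the index pair explicitly in every row, which is one place where the storage overhead mentioned above can come from.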

I don’t think I’m too much of a relational bigot, but it does appear to me that just about all of the reasons have to do with the limitations and emphases of commercial implementations of the relational model, and not so much with the conceptual model itself. Namely,

If even some of these problems were rectified, I think relational databases would quickly become the platform of choice for scientific data management.