Multimodal models understand time-series data better through their vision encoders, when the data is shown as a plot, than through a textual representation of long arrays of floating-point numbers.
Daswani, Mayank, et al. "Plots Unlock Time-Series Understanding in Multimodal Models." arXiv preprint. doi.org/10.48550/arX...
While multimodal foundation models can now natively work with data beyond text, they remain underutilized in analyzing the considerable amounts of multi-dimensional time-series data in fields like hea...
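
To make the contrast in the note above concrete, here is a minimal sketch of the two input representations: the same series serialized as a long string of floats, and drawn as a line plot. It is not the paper's pipeline. The helper names `series_to_text` and `series_to_svg` are hypothetical, and SVG is used only to keep the sketch dependency-free; in practice one would presumably render a raster image with a plotting library and pass that to the model's vision encoder.

```python
import math


def series_to_text(values, precision=4):
    """Serialize a series as comma-separated floats (the text baseline)."""
    return ", ".join(f"{v:.{precision}f}" for v in values)


def series_to_svg(values, width=400, height=200, pad=10):
    """Draw a series as a minimal SVG line plot.

    Hypothetical stand-in for a real plotting library; the canvas size
    is fixed by width and height, not by the number of samples.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series
    step = (width - 2 * pad) / max(len(values) - 1, 1)
    points = []
    for i, v in enumerate(values):
        x = pad + i * step
        # SVG's y axis points downward, so flip the value vertically.
        y = height - pad - (v - lo) / span * (height - 2 * pad)
        points.append(f"{x:.1f},{y:.1f}")
    pts = " ".join(points)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">'
        f'<polyline fill="none" stroke="black" points="{pts}"/>'
        "</svg>"
    )


# Illustrative data only: 1000 samples of a sine wave with period 50.
values = [math.sin(2 * math.pi * t / 50) for t in range(1000)]

text_prompt = series_to_text(values)  # grows linearly with len(values)
svg_plot = series_to_svg(values)      # always drawn on a 400x200 canvas

print(len(text_prompt), "characters of text for", len(values), "samples")
```

The text form grows with every added sample, while the plot keeps the same canvas size and exposes the overall shape (here, the periodicity) at a glance. Whether a given model actually reads the plot better than the numbers is the empirical question the cited paper addresses; this sketch only shows how the two inputs are built.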