In this paper we address the problem of classifying multidimensional time-evolving data in dynamic scenes. To exploit the correlation between the different data channels, we introduce a generalized form of the stabilized higher-order linear dynamical system (sh-LDS) and represent the multidimensional signal as a third-order tensor. We show that the parameters of the proposed model lie on a Grassmann manifold, and we approach the classification problem by studying the geometric properties of the sh-LDS space. To handle the non-linearity of the observation data, we represent each multidimensional signal as a cloud of points on the Grassmann manifold and build a codebook by identifying the most representative points. Finally, each multidimensional signal is classified using a bag-of-systems approach, in which the variation of the class of each codeword is first modeled on its tangent space rather than in the sh-LDS space. The proposed methodology is evaluated in three application domains, namely video-based surveillance systems, dynamic texture categorization and human action recognition, and the results indicate its potential.
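To make the geometric part of this pipeline concrete, the following Python sketch illustrates how points on a Grassmann manifold can be compared and assigned to codewords. It is a simplified illustration under stated assumptions, not the method proposed here: each point is taken to be an orthonormal basis obtained from a plain SVD of a data matrix (a stand-in for the sh-LDS descriptor), codeword assignment uses the nearest codeword under the geodesic distance, and the tangent-space modeling step is omitted. All function names (`subspace_basis`, `grassmann_distance`, `codeword_histogram`) are hypothetical.

```python
import numpy as np


def subspace_basis(Y, n):
    """Return a d x n orthonormal basis spanning the top-n left singular
    vectors of the d x T data matrix Y (assumed stand-in descriptor)."""
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :n]


def grassmann_distance(U1, U2):
    """Geodesic distance between span(U1) and span(U2), computed from the
    principal angles (arccos of the singular values of U1^T U2)."""
    s = np.linalg.svd(U1.T @ U2, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))  # clip guards against round-off
    return float(np.linalg.norm(theta))


def codeword_histogram(points, codebook):
    """Assign each point to its nearest codeword and return the normalized
    histogram of assignments (a bag-of-systems style representation)."""
    hist = np.zeros(len(codebook))
    for P in points:
        dists = [grassmann_distance(P, C) for C in codebook]
        hist[int(np.argmin(dists))] += 1
    return hist / hist.sum()
```

In such a sketch, a signal would be split into patches, each patch mapped to a basis with `subspace_basis`, and the resulting histogram passed to any standard classifier; how the codebook is selected is left open here.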