Reconfigurable intelligent surface (RIS)-assisted wireless systems require accurate channel state information (CSI) to control wireless channels and improve overall network performance. However, CSI acquisition is non-trivial because the RIS is passive, and the dimension of the cascaded channel between the transceivers grows with the number of RIS elements, which incurs high training overhead. Prior work has addressed frequency-selective channel estimation but has ignored the beam squint effect in wideband systems, which severely degrades channel estimation performance. This paper proposes a novel data-driven approach for estimating the wideband cascaded channels of RIS-assisted multi-user millimeter-wave massive multiple-input multiple-output (MIMO) systems with limited training overhead, explicitly accounting for the beam squint effect. To circumvent the beam squint effect, the proposed method exploits the common sparsity among the different subcarriers as well as the double-structured sparsity of the users’ angular cascaded channel matrices, and it uses denoising neural networks to detect the channel supports accurately. Compared to traditional orthogonal matching pursuit (OMP) approaches that are agnostic to beam squint, the proposed approach achieves a 5-6 dB lower normalized mean square error (NMSE) and reduces the gap to the oracle least-squares lower bound to only 1 dB.
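
For readers less familiar with the NMSE metric quoted above, the following is a minimal sketch of how an NMSE value in dB could be computed for a single channel realization. It assumes the common definition (squared estimation error divided by the energy of the true channel), which may differ in detail from the averaging used in the paper's evaluation. The function name `nmse_db` and the toy vectors are hypothetical and serve only as illustration.

```python
import math


def nmse_db(h_true, h_est):
    """NMSE in dB between a true and an estimated channel vector.

    Assumed definition: 10*log10(||h_est - h_true||^2 / ||h_true||^2).
    Entries may be real or complex numbers.
    """
    if len(h_true) != len(h_est):
        raise ValueError("channel vectors must have the same length")
    # Squared error energy and reference energy.
    err = sum(abs(t - e) ** 2 for t, e in zip(h_true, h_est))
    ref = sum(abs(t) ** 2 for t in h_true)
    if ref == 0:
        raise ValueError("reference channel has zero energy")
    if err == 0:
        return float("-inf")  # perfect estimate
    return 10.0 * math.log10(err / ref)


# Toy example (hypothetical values): a sparse 4-tap channel and a noisy estimate.
h_true = [1 + 0j, 0j, 1j, 0j]
h_est = [0.9 + 0j, 0j, 1j, 0.1 + 0j]
example_nmse = nmse_db(h_true, h_est)  # close to -20 dB
```

On this scale, a 5-6 dB reduction corresponds to a linear NMSE that is roughly 3.2 to 4 times smaller, since 10^(0.5) ≈ 3.16 and 10^(0.6) ≈ 3.98.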