TY - JOUR
T1 - Self-Supervised Pre-training Vision Transformer with Masked Autoencoders for Building Subsurface Model
AU - Li, Yuanyuan
AU - Alkhalifah, Tariq Ali
AU - Huang, Jianping
AU - Li, Zhenchun
N1 - KAUST Repository Item: Exported on 2023-09-01
Acknowledgements: This work was supported in part by KAUST and in part by the Marine ST Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) under Grant 2021QNLM020001. We appreciate the fruitful discussions and collaborative environment in our SWAG group. We thank Statoil ASA and the Volve license partners ExxonMobil E&P Norway AS and Bayerngas Norge AS for the release of the Volve data. The real data shown in the last example are provided courtesy of CGG.
PY - 2023/8/28
Y1 - 2023/8/28
N2 - Building subsurface models is an important but challenging task in hydrocarbon exploration and development. The subsurface elastic properties are usually sourced from seismic data and well logs. Thus, we design a deep learning (DL) framework with a Vision Transformer (ViT) as the backbone architecture to build the subsurface model from well log information as we apply full waveform inversion (FWI) to the seismic data. However, a ViT network trained from scratch on limited well log data struggles to achieve good generalization. To overcome this, we implement an efficient self-supervised pre-training process using a masked autoencoder (MAE) architecture to learn important feature representations in seismic volumes. The seismic volumes required for pre-training are randomly extracted from a seismic inversion result, such as an FWI result. We can also incorporate a reverse time migration (RTM) image into the seismic volumes to provide additional structural information. The pre-training task of the MAE is to reconstruct the original image from a masked version with a masking ratio of 75%. This task enables the network to learn high-level latent representations. After pre-training, we fine-tune the ViT network to build an optimal mapping between 2D seismic volumes and 1D well segments. Once fine-tuning is finished, we apply the trained ViT network to the whole seismic inversion domain to predict the subsurface model. Finally, we use one synthetic data set and two field data sets to test the performance of the proposed method. The test results demonstrate that the proposed method effectively integrates seismic and well information to improve the resolution and accuracy of the velocity model.
AB - Building subsurface models is an important but challenging task in hydrocarbon exploration and development. The subsurface elastic properties are usually sourced from seismic data and well logs. Thus, we design a deep learning (DL) framework with a Vision Transformer (ViT) as the backbone architecture to build the subsurface model from well log information as we apply full waveform inversion (FWI) to the seismic data. However, a ViT network trained from scratch on limited well log data struggles to achieve good generalization. To overcome this, we implement an efficient self-supervised pre-training process using a masked autoencoder (MAE) architecture to learn important feature representations in seismic volumes. The seismic volumes required for pre-training are randomly extracted from a seismic inversion result, such as an FWI result. We can also incorporate a reverse time migration (RTM) image into the seismic volumes to provide additional structural information. The pre-training task of the MAE is to reconstruct the original image from a masked version with a masking ratio of 75%. This task enables the network to learn high-level latent representations. After pre-training, we fine-tune the ViT network to build an optimal mapping between 2D seismic volumes and 1D well segments. Once fine-tuning is finished, we apply the trained ViT network to the whole seismic inversion domain to predict the subsurface model. Finally, we use one synthetic data set and two field data sets to test the performance of the proposed method. The test results demonstrate that the proposed method effectively integrates seismic and well information to improve the resolution and accuracy of the velocity model.
UR - http://hdl.handle.net/10754/693907
UR - https://ieeexplore.ieee.org/document/10231372/
U2 - 10.1109/tgrs.2023.3308999
DO - 10.1109/tgrs.2023.3308999
M3 - Article
SN - 0196-2892
SP - 1
EP - 1
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
ER -