TY - GEN
T1 - ML-descent: An optimization algorithm for FWI using machine learning
AU - Sun, Bingbing
AU - Alkhalifah, Tariq Ali
N1 - Acknowledgements: We thank KAUST for funding this research and the members of the SWAG group for useful discussions.
PY - 2019/8/10
Y1 - 2019/8/10
AB - Full-waveform inversion (FWI) is a nonlinear inversion problem, and a typical optimization algorithm, such as nonlinear conjugate gradient or L-BFGS, iteratively updates the model along the gradient-descent direction of the misfit function or a slight modification of it. Rather than using a hand-designed optimization algorithm, we train a machine to learn an optimization algorithm, which we refer to as "ML-descent", and apply it to FWI. Using a recurrent neural network (RNN), we take the gradient of the misfit function as the input for training, and the hidden states in the RNN use the history of the gradient, similar to a BFGS algorithm. However, unlike the fixed BFGS algorithm, the ML version evolves as the gradient directs it to evolve. The loss function for training is formulated as a summation of the FWI misfit function, given by the L2-norm of the data residual. Any well-defined nonlinear inverse problem can be locally approximated by a linear convex problem; thus, to accelerate training, we train the neural network using solutions of randomly generated quadratic functions instead of the time-consuming FWI gradient. We use the Marmousi example to demonstrate that ML-descent outperforms the steepest-descent method, and that the energy in the deeper part of the model is compensated well by ML-descent when the pseudo-inverse of the Hessian is not incorporated in the gradient of FWI.
UR - http://hdl.handle.net/10754/661903
UR - https://library.seg.org/doi/10.1190/segam2019-3215304.1
UR - http://www.scopus.com/inward/record.url?scp=85079486031&partnerID=8YFLogxK
U2 - 10.1190/segam2019-3215304.1
DO - 10.1190/segam2019-3215304.1
M3 - Conference contribution
SP - 2288
EP - 2292
BT - SEG Technical Program Expanded Abstracts 2019
PB - Society of Exploration Geophysicists
ER -