This study explores an application framework based on large language models (LLMs) to address the sophisticated demands of reservoir dynamic analysis (RDA), including data retrieval and analysis, detailed well profiling, computation of key technical indicators, and the development of analysis solutions. The framework couples a large language foundation model with incremental pre-training, fine-tuning, and subsystem coupling. Key innovations in the specialized fine-tuning technologies include prompt-engineering-based named entity recognition (NER), classification-based tool invocation, and Text-to-SQL construction, which together resolve the pivotal challenges of building an RDA-specific LLM application. Detailed accuracy tests were conducted on the feature extraction, tool classification, data retrieval, and analysis recommendation models; the results indicate that these models perform well across the key aspects of RDA. The framework was tested on injection and production well groups in a real block of the PK3 Fault Block transition zone of the Daqing Oilfield. The test results show that the model has significant potential and practical value in assisting reservoir engineers with RDA, providing strong support for the application of LLMs in reservoir performance analysis.
PAN Huanquan, LIU Jianqiao, GONG Bin, ZHU Yiheng, BAI Junhui, HUANG Hu, FANG Zhengbao, JING Hongbin, LIU Chen, KUANG Tie, LAN Yubo, WANG Tianzhi, XIE Tian, CHENG Mingzhe, QIN Bin, SHEN Yujiang. Construction and application of large language model for reservoir performance analysis[J]. Petroleum Exploration and Development, 0: 20241008-20241008.
DOI: 10.11698/PED.20240208
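Two of the fine-tuning techniques named in the abstract, prompt-engineering-based NER and Text-to-SQL construction, can be illustrated with a minimal prompt-building sketch. This is not the authors' implementation: the table schema, entity labels, function names, and prompt wording below are all illustrative assumptions about how such prompts might be assembled before being sent to a fine-tuned LLM.

```python
# Illustrative sketch only: the schema, entity labels, and prompt text are
# assumptions, not taken from the paper's actual system.

# A hypothetical production-data table an RDA assistant might query.
WELL_SCHEMA = (
    "TABLE production(well_id TEXT, record_date TEXT, "
    "oil_rate REAL, water_cut REAL, pressure REAL)"
)

# Hypothetical entity types for reservoir-analysis questions.
NER_LABELS = ["WELL", "DATE_RANGE", "INDICATOR"]


def build_ner_prompt(question: str) -> str:
    """Prompt-based NER: ask the model to return entities as JSON."""
    return (
        "Extract the following entity types from the question and "
        f"return them as a JSON object with keys {NER_LABELS}.\n"
        f"Question: {question}\n"
        "JSON:"
    )


def build_text_to_sql_prompt(question: str) -> str:
    """Text-to-SQL: ask the model for one SQL query over a known schema."""
    return (
        f"Given the database schema:\n{WELL_SCHEMA}\n"
        "Write a single SQL query that answers the question. "
        "Return only the SQL.\n"
        f"Question: {question}\n"
        "SQL:"
    )


if __name__ == "__main__":
    q = "What was the water cut of well PK3-12 in March 2023?"
    print(build_ner_prompt(q))
    print(build_text_to_sql_prompt(q))
```

In a deployed pipeline, the returned prompt strings would be passed to the fine-tuned model, and the JSON or SQL in its reply would be parsed and executed against the production database; the classification-based tool invocation step would decide which of these prompts to issue for a given user request.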