Monday, 16 March, 2026/Edition 102

Machine learning (ML) is a type of AI that uses algorithms to train computers to learn from empirical data, make predictions, and improve their performance on specific problems. ML approaches are increasingly being applied to problems in geoscience.

This week, we will look at some recent ML applications to tight oil shale formations — a timely development as oil and gas production practices within shale plays need huge improvements.

Rasoul Sorkhabi

Editor, Core Elements

How to Identify Shale Sweet Spots

Horn River Shale/ Wikimedia Commons

Identifying the most productive areas in shale formations is critical for prospectivity and producibility of shale oil and gas plays.

Nandito Davy and colleagues have used geological data and ML methods to identify shale gas sweet spots in Canada. Their paper is published in Unconventional Resources. Let’s see what they did.

Study area:

Datasets from three wells drilled in the Horn River Formation in northeastern British Columbia, Canada, were used for the study.
The Horn River shale formation is of Middle-to-Upper Devonian age and consists of three organic-rich members deposited in a marine environment.

Rock-physics templates: The researchers constructed rock physics templates for the following data sets:

Total organic carbon (TOC) measured from cuttings
Mineralogical-based brittleness index (MBI) calculated from mineralogical and porosity analysis of samples
Gas saturation calculated from neutron porosity and deep resistivity well logs
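The newsletter does not say which saturation equation the authors used. As a rough illustration of how porosity and deep resistivity can be turned into a gas saturation, here is a minimal sketch that assumes the classic Archie equation; the function name, the formation-water resistivity, and the exponents a, m, and n are hypothetical defaults, not values from the paper.

```python
def gas_saturation_archie(phi, rt_ohm_m, rw_ohm_m=0.05, a=1.0, m=2.0, n=2.0):
    """Gas saturation from porosity and deep resistivity (Archie, assumed).

    phi       : porosity as a fraction (e.g. from the neutron log)
    rt_ohm_m  : deep (true formation) resistivity, ohm-m
    rw_ohm_m  : formation-water resistivity, ohm-m (hypothetical default)
    a, m, n   : Archie constants (hypothetical defaults)
    """
    # Archie water saturation: Sw = (a * Rw / (phi^m * Rt))^(1/n)
    sw = ((a * rw_ohm_m) / (phi ** m * rt_ohm_m)) ** (1.0 / n)
    # Keep the result physically meaningful (0 to 1)
    sw = min(max(sw, 0.0), 1.0)
    # Whatever pore space is not water is assumed to hold gas
    return 1.0 - sw


# Example with made-up log readings: 10 percent porosity, 50 ohm-m
print(round(gas_saturation_archie(0.10, 50.0), 3))
```

Shales often call for clay-corrected saturation models, so this should be read only as the simplest starting point.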

Five elastic parameters:

Primary-wave impedance
Shear-wave impedance
Primary-to-shear-wave velocity ratio (Vp/Vs)
Density multiplied by the first Lamé parameter (related to incompressibility)
Density multiplied by the second Lamé parameter (related to rigidity)

These five parameters were obtained from density, compressional sonic, and shear sonic logs.
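For readers who want to see the arithmetic, the sketch below derives the five parameters from one log sample using the standard rock-physics relations (impedance is density times velocity, lambda-rho is Ip squared minus twice Is squared, and mu-rho is Is squared). The function name and the unit choices (g/cm3 and microseconds per foot) are assumptions for illustration; the paper may use different units or processing.

```python
def elastic_parameters(rho_g_cc, dtc_us_ft, dts_us_ft):
    """Return the five elastic parameters for one log sample.

    rho_g_cc  : bulk density, g/cm3
    dtc_us_ft : compressional sonic slowness, microseconds per foot
    dts_us_ft : shear sonic slowness, microseconds per foot
    """
    # Slowness (us/ft) to velocity (km/s): 1 ft = 0.3048 m, so V = 304.8 / DT
    vp = 304.8 / dtc_us_ft
    vs = 304.8 / dts_us_ft
    ip = rho_g_cc * vp   # primary-wave impedance
    is_ = rho_g_cc * vs  # shear-wave impedance
    return {
        "Ip": ip,
        "Is": is_,
        "Vp/Vs": vp / vs,
        "lambda_rho": ip ** 2 - 2.0 * is_ ** 2,  # density x first Lame parameter
        "mu_rho": is_ ** 2,                      # density x second Lame parameter
    }


# Example with made-up readings
for name, value in elastic_parameters(2.5, 76.2, 152.4).items():
    print(name, round(value, 3))
```

With density in g/cm3 and velocity in km/s, lambda-rho and mu-rho come out in GPa times g/cm3.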

ML models: Statistical analysis of cross-plots of the five elastic parameters, plus the three quality factors, was used to optimize rock-physics templates.

The following ML algorithms were trained and tested on the datasets:

Artificial Neural Networks (ANNs)
Extreme Gradient Boosting (XGB)
Logistic Regression (LR)
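To give a feel for the simplest of the three, here is a minimal logistic-regression classifier written from scratch with batch gradient descent. It is a sketch under stated assumptions, not the authors' workflow: the two standardized features and the sweet-spot labels are entirely made up, and a real study would normally rely on an established ML library.

```python
import math


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights w and bias b by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of class 1
            err = p - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict(w, b, x):
    """Return 1 (sweet spot) if the predicted probability is at least 0.5."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1 if z >= 0 else 0


# Hypothetical standardized features: [P-impedance, Vp/Vs]; label 1 = sweet spot
X = [[-1.2, -0.9], [-0.8, -1.1], [-1.0, -0.5],
     [0.9, 1.0], [1.1, 0.7], [0.6, 1.2]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic_regression(X, y)
print([predict(w, b, x) for x in X])
```

Because the model is a single weighted sum passed through a sigmoid, the sign and size of each weight hint at how each elastic parameter pushes the prediction.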

What they found:

Primary-wave impedance, shear-wave impedance, and the primary-to-shear-wave velocity ratio rank as the most important elastic parameters for identifying sweet spots in both rock-physics templates and ML analyses.
ML-based approaches outperform rock-physics templates by capturing multidimensional and non-linear relationships between elastic parameters.
Among the ML algorithms, logistic regression (LR) and XGB offer an effective balance between predictive performance and computational efficiency.
LR achieved a validation score of 0.801 compared to the rock-physics template performance of 0.746.

Go deeper: Read the paper here.

A message from AAPG

With data centers and AI becoming increasingly prevalent, your team needs the tools to successfully navigate the future of the energy sector, when carbon management and development will be more vital than ever.

Save your seat and join geoscientists and engineers to experience three days of key lessons learned, global connections, and real-world applications driving progress across the full CCUS value chain.

Register for CCUS taking place in The Woodlands, Texas 30 March–1 April.


Using ML to Forecast Production Performance

Bakken Play/ USGS

Yeonpyeong Jo and colleagues have reported an ML application to oil production performance in the Bakken Formation. Their paper is published in Unconventional Resources.

Study area:

The Bakken Formation consists of a middle member (dolomitic siltstone) sandwiched between two organic-rich shale members.
The formation was deposited during the Late Devonian to Early Mississippian under deep marine conditions.
The Bakken in the Williston Basin spans parts of North Dakota and Montana.

What they did:

The researchers collected data from more than 2,000 horizontal wells. Datasets were split into 80 percent training and 20 percent test data.
The predictor factors included geographic coordinates, vertical and lateral well lengths, the number of fracturing stages, and the total volumes of water and proppants for each fracturing stage.
The prediction target was the production performance of wells, including cumulative production and four normalized production indices (production barrels per production month per lateral foot per fracturing stage).
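The two data-preparation steps described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the well numbers are invented, and the index shown is one reading of the parenthetical (cumulative barrels divided by producing months, lateral feet, and fracturing stages).

```python
import random


def normalized_production_index(cum_bbl, months, lateral_ft, stages):
    """Barrels per production month per lateral foot per fracturing stage."""
    return cum_bbl / (months * lateral_ft * stages)


def train_test_split(records, test_fraction=0.2, seed=0):
    """Shuffle the records and hold out a fraction of them for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up well: 120,000 bbl over 12 months, 10,000 ft lateral, 40 stages
print(normalized_production_index(120_000, 12, 10_000, 40))

# Ten placeholder well IDs split 80/20
train, test = train_test_split(list(range(10)))
print(len(train), len(test))
```

Normalizing by lateral length and stage count lets wells of very different sizes be compared on the same footing.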

ML models:

Random Forest, XGBoost, and Multilayer Perceptron were applied.
Heat maps and SHAP (SHapley Additive exPlanations) figures were constructed to evaluate and rank the importance of the various predictors.
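SHAP rankings rest on Shapley values, which split a prediction among the input features. The sketch below computes exact Shapley values for a tiny toy model by swapping each feature between a baseline and its actual value. It is not the SHAP package and not the authors' model: the function names, the toy model, and the three inputs are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley value of each feature for one prediction."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value; the rest stay at baseline
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Weight for coalitions of size k that do not contain feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical production score from location, water, and proppant."""
    location, water, proppant = z
    return 3.0 * location + 1.0 * water + 0.5 * water * proppant


phi = shapley_values(toy_model, x=[1.0, 2.0, 4.0], baseline=[0.0, 0.0, 0.0])
print(phi)
```

The values always add up to the difference between the prediction and the baseline prediction, which is what makes them usable for ranking predictors. Exact enumeration grows exponentially with the number of features, so practical tools rely on approximations.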

What they found:

The geographic location of wells emerged as a universal control factor, with the eastern-central region of the Williston Basin showing superior production performance.
Water and proppant management strategy ranked as the next most important control factor.
Among the ML models, XGBoost ranked as the best-performing model for all scenarios.

Go deeper: Read the full paper here.

Call For Expression of Interest

Licensing round of nine free blocks of the Cameroon oil and gas domain.

Deadline for submission is 30 March 2026.


ML for Mineralogical Brittleness and Pore Types

Marcellus Shale Bank/Wikimedia Commons

Two recent studies have targeted specific rock properties. Let’s take a look.

Mineralogical Brittleness

The response of shale to hydraulic fracturing depends on the rock’s brittleness. The brittleness index is usually estimated in one of two ways: from mineralogical (compositional) properties or from geomechanical (elastic) properties.
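As an illustration of the mineralogical route, one widely cited definition (Jarvie and colleagues, 2007) takes the share of quartz in the sum of quartz, carbonate, and clay. The sketch below assumes that definition; it is not necessarily the index used in the study discussed here, and the function name and sample composition are hypothetical.

```python
def mineralogical_brittleness_index(quartz, carbonate, clay):
    """Quartz fraction of quartz + carbonate + clay (Jarvie-style index).

    Inputs are weight fractions or percentages; only their ratios matter.
    """
    total = quartz + carbonate + clay
    if total <= 0:
        raise ValueError("mineral contents must sum to a positive number")
    return quartz / total


# Made-up sample: 50% quartz, 20% carbonate, 30% clay
print(mineralogical_brittleness_index(50.0, 20.0, 30.0))  # prints 0.5
```

Other published indices also count dolomite or feldspar as brittle minerals, so the value depends on which definition is chosen.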

Wang and colleagues have reported an ML approach to estimating mineralogical brittleness in the Journal of Energy Engineering.

What they did:

The researchers collected 55 rock samples from the Marcellus Shale in the Appalachian Basin.
They determined the mineralogical composition of the samples by the X-ray fluorescence (XRF) technique.

What they found:

Prediction results from three ML algorithms were compared: Neural Network, Random Forest, and Ensemble Model.
The ensemble model trained on empirical data predicted the mineralogical brittleness of test samples with an accuracy of up to 83 percent.

Prediction of Pore Characteristics

Peng and Periwal present a deep-learning workflow to characterize pore type distribution in tight reservoir rocks. They published their work in Fuel.

What they did:

The researchers used four samples from the Bone Spring Formation in the Delaware Basin and one sample from the Eagle Ford Formation.
They conducted high-resolution imaging of these samples using large-area scanning electron microscopy (SEM).
Deep-learning model training was conducted using Avizo software (Thermo Fisher Version 2023).
The data were split 75 percent for training and 25 percent for testing (validation).

What they found:

A stepwise classification algorithm characterized three pore types:
1. Organic matter-lined pores
2. Clean mineral pores
3. Intraparticle (fluid inclusion) pores
The pore size distribution obtained from large-area imagery was compared with that derived from mercury intrusion capillary pressure measurements.
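To compare the two kinds of measurement, both have to be expressed as a pore size. A minimal sketch of one way to do that follows: mercury injection pressure is converted to a pore-throat diameter with the Washburn equation, and an imaged pore area is converted to an equivalent circular diameter. The function names, the surface tension (0.485 N/m), and the contact angle (140 degrees) are common textbook assumptions, not values taken from the paper.

```python
import math


def washburn_throat_diameter_nm(pressure_pa, surface_tension_n_m=0.485,
                                contact_angle_deg=140.0):
    """Pore-throat diameter (nm) from mercury injection pressure (Pa)."""
    # Washburn: d = -4 * gamma * cos(theta) / P
    cos_theta = math.cos(math.radians(contact_angle_deg))
    diameter_m = -4.0 * surface_tension_n_m * cos_theta / pressure_pa
    return diameter_m * 1e9


def equivalent_circular_diameter(area):
    """Diameter of a circle with the same area as an imaged pore."""
    return 2.0 * math.sqrt(area / math.pi)


# Made-up examples: 100 MPa injection pressure, and a 2,000 nm^2 imaged pore
print(round(washburn_throat_diameter_nm(100e6), 2), "nm")
print(round(equivalent_circular_diameter(2000.0), 2), "nm")
```

Mercury intrusion responds to pore throats while images capture pore bodies, so the two distributions are not expected to match exactly.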

The bottom line:

Researchers have only just begun to use ML models to predict rock properties in shale formations from empirical data, and there is still a long way to go.
Cumulative capabilities derived from the ML models will contribute to reservoir quality assessment of shale plays.


Want to help AAPG grow?

Consider supporting AAPG's free resources, like this one, by donating today.