info@techdevops.com | 437-991-3573 | Data Engineering Services
TechDevOps.com
Resources Tools
Experts in Microsoft SQL Server on Windows, Linux, Containers | Clusters, Always On, FCI | Migrations, Cloud, Performance



T-SQL - Extract out the Date within File Names
by BF (Principal Consultant; Architecture; Engineering)
2016-12-05








Data Prep:

INSERT INTO Photos SELECT 1000, 1, '\\FILES\IMAGES\INDIVIDUAL\Design01_20160101.jpg'
INSERT INTO Photos SELECT 1000, 2, '\\FILES\IMAGES\INDIVIDUAL\Design01_20160101.jpg'
INSERT INTO Photos SELECT 1000, 3, '\\FILES\IMAGES\INDIVIDUAL\Design01_20160101.jpg'
INSERT INTO Photos SELECT 1000, 4, '\\FILES\IMAGES\INDIVIDUAL\Design01_20160102.jpg'
INSERT INTO Photos SELECT 1000, 5, '\\FILES\IMAGES\INDIVIDUAL\Design01_20160107.jpg'


Task: Extract out the date into a separate field for analysis


Solution: SUBSTRING & PATINDEX Functions with a Regular Expression

Use the PATINDEX(Pattern Index Function), with a REGEX, to find a pattern which will indicate the Start Position for the SUBSTRING Function.
(i.e. you can use the PATINDEX function to look for a specific pattern and not only just a specific string as in many
use cases).

SELECT
IndividualID,
PhotoID,
PhotoPath,
SUBSTRING(PhotoPath, PATINDEX('%[2][0][0-9][0-9][0-9][0-9][0-9][0-9]%', PhotoPath), 8) AS 'PhotoPathExtractedDate'
FROM dbo.Photos





Code Assumptions:
- The Date format is always {YearMonthDay}: ex. 20160101 (Integer)
- The Year must start with [2][0]
- The Year is 4 characters in length, Month is 2 characters and Day is 2 characters


Resources:

PATINDEX (Transact-SQL)

SUBSTRING (Transact-SQL)

Comparing CHARINDEX and PATINDEX