Software Engineer, Equity Strategist, Polymath.
Posts tagged occurrence
Find Character occurrence in String
015 years
by Ray Tawil
in MS Sql Server
I made this query to answer a challenge given from a programming forum. After the query is done I realized that I’ve made a mistake. And it was quite a mistake. The challenge ask you to list all the character that occur in sequence. What I made is a query to list the most sequential occurrence character..
I know this query is far from perfect. But just let me share it with you.
this is the data source
Data ------------------------------------ 9992EDC6-D117-4DEE-B410-4E5FAE46AE97 0BFC936B-BD9A-4C6A-AFB2-CF3F1752F8B1 4A73E7EB-7777-4A04-9258-F1E75097977C 5AAF477C-274D-400D-9067-035968F33B19 725DA718-30D0-44A9-B36A-89F27CDFEEDE 8083ED5A-D3B9-4694-BB04-F0B09C588888 22244444-43B9-4694-BB04-F0B098888888
expected result :
Data pattern len pattern2 len2 ---------------------------------------------------------------------------- 9992EDC6-D117-4DEE-B410-4E5FAE46AE97 999 3 NULL NULL 0BFC936B-BD9A-4C6A-AFB2-CF3F1752F8B1 NULL NULL NULL NULL 4A73E7EB-7777-4A04-9258-F1E75097977C 7777 4 NULL NULL 5AAF477C-274D-400D-9067-035968F33B19 AA 2 NULL NULL 725DA718-30D0-44A9-B36A-89F27CDFEEDE 44 2 NULL NULL 8083ED5A-D3B9-4694-BB04-F0B09C588888 88888 5 NULL NULL 22244444-43B9-4694-BB04-F0B098888888 8888888 7 NULL NULL DATA Char Pos Len ------------------------------------------------------------ 9992EDC6-D117-4DEE-B410-4E5FAE46AE97 9 1 3 0BFC936B-BD9A-4C6A-AFB2-CF3F1752F8B1 NULL NULL NULL 4A73E7EB-7777-4A04-9258-F1E75097977C 7 10 4 5AAF477C-274D-400D-9067-035968F33B19 A 2 2 725DA718-30D0-44A9-B36A-89F27CDFEEDE 4 15 2 8083ED5A-D3B9-4694-BB04-F0B09C588888 8 32 5 22244444-43B9-4694-BB04-F0B098888888 8 30 7
My query
Create table #t (Data VARCHAR(40), pattern varchar(50), [len] int, pattern2 varchar(50), len2 int ) INSERT #t (Data) SELECT '9992EDC6-D117-4DEE-B410-4E5FAE46AE97' INSERT #t (Data) SELECT '0BFC936B-BD9A-4C6A-AFB2-CF3F1752F8B1' INSERT #t (Data) SELECT '4A73E7EB-7777-4A04-9258-F1E75097977C' INSERT #t (Data) SELECT '5AAF477C-274D-400D-9067-035968F33B19' INSERT #t (Data) SELECT '725DA718-30D0-44A9-B36A-89F27CDFEEDE' INSERT #t (Data) SELECT '8083ED5A-D3B9-4694-BB04-F0B09C588888' INSERT #t (Data) SELECT '22244444-43B9-4694-BB04-F0B098888888' select * from #t declare @pattern varchar(20) set @pattern = null declare @maxlen int declare @e int declare @i varchar(2) declare @j varchar(2) set @maxlen = (select MAX(len(data)) from #t) print @maxlen declare @q nvarchar(4000) set @i = 1 set @j = 2 while (@i < 36) begin print @i set @q = ' update #t set pattern = right(isnull(pattern,''''),isnull(len(pattern),1)-1) + SUBSTRING(data, '+@i+', 1) + SUBSTRING(Data,'+@j+',1) where SUBSTRING(data, '+@i+', 1) = SUBSTRING(Data,'+@j+',1) and RIGHT(ISNULL(pattern, SUBSTRING(Data,'+@i+',1)),1) = SUBSTRING(Data,'+@i+',1) and isnull(substring(data, '+@i+'-[len]+1, 1),SUBSTRING(data, '+@i+', 1)) = SUBSTRING(data, '+@i+', 1) update #t set pattern2 = right(isnull(pattern2,''''),isnull(len(pattern2),1)-1) + SUBSTRING(data, '+@i+', 1) + SUBSTRING(Data,'+@j+',1) where SUBSTRING(data, '+@i+', 1) = SUBSTRING(Data,'+@j+',1) and RIGHT(ISNULL(pattern2, SUBSTRING(Data,'+@i+',1)),1) = SUBSTRING(Data,'+@i+',1) and isnull(substring(data, '+@j+'-[len2]+1, 1), SUBSTRING(data, '+@i+', 1)) = SUBSTRING(data, '+@i+', 1) and [len] != 0 update #t set [len] = len(pattern), len2 = len(pattern2) update #t set [pattern] = [pattern2], [len]= len2, pattern2 = NULL where len2 > [len] and SUBSTRING(data, '+@j+', 1) <> SUBSTRING(data, '+@j+'+1, 1) update #t set [pattern2] = NULL, [len2]= NULL where SUBSTRING(data, '+@j+', 1) <> SUBSTRING(data, '+@j+'+1, 1) ' print @q exec SP_executesql @q set @i +=1 set @j +=1 end select * from #t select DATA, LEFT(pattern,1) AS 'Char', PATINDEX('%'+pattern+'%', DATA) AS Pos, [Len] from #t
Basically what I am trying to do is compare the char with the following char. And Keep the pattern and if I found another sequential occurrence the most frequent will be recorded.