The KMP Algorithm and Its Applications: Cutting Cloth Strips (剪花布條, HDU 2087), Power Strings (POJ 2406), Cyclic Nacklace (HDU 3746)
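KMP is a linear-time string matching algorithm designed by D. E. Knuth, J. H. Morris and V. R. Pratt, hence the name KMP.
String matching means finding where one string occurs inside another; of course, it may also turn out that it does not occur at all.
For example, take the target string ss:
- abcabcabce
and the pattern s to be matched:
- abcabce
The first 6 characters match, but the 7th fails (ss[6] != s[6]). The usual approach is to go back to the 2nd character of ss and restart s from its first character. In the worst case that costs O(n*m) time (n and m being the lengths of ss and s), so it is natural to ask whether there is a better way.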
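The backtracking approach just described can be sketched as follows. This is only a minimal illustration; the function name naive_find and the check function are my own and are not part of any library.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (name is illustrative): brute-force search.
// Returns the 0-based index of the first occurrence of s in ss, or -1.
int naive_find(const char *ss, const char *s)
{
    int n = (int)strlen(ss), m = (int)strlen(s);
    for (int start = 0; start + m <= n; start++) {
        int k = 0;
        // Compare s against ss from scratch at every start position.
        while (k < m && ss[start + k] == s[k]) k++;
        if (k == m) return start;
    }
    return -1;
}

// The example from the text: the match begins at index 3.
void check_naive_find()
{
    assert(naive_find("abcabcabce", "abcabce") == 3);
    assert(naive_find("abcabcabce", "abd") == -1);
}
```

Every failed attempt throws away what the inner loop has already learned about ss, which is exactly the waste KMP removes.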
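Here is my understanding:
1. Look at the structure of the pattern:
Its first three characters abc and its 4th to 6th characters abc are identical. So if the first 6 characters all matched, then after shifting s three places to the right, its first three characters are already known to match.
We know those characters must match, yet the naive method makes the computer check them again anyway (wasted work, but it still goes through the motions, which is where TLE comes from... enough of that, it hurts, qaq).
2. Now look at this string
Every character of the target string (drawn in blue in the original figure) is different, yet matching still fails at the sixth position. Because all the characters are distinct, it is obvious that after backtracking one step not a single character can match, so the retry is wasted work again.
3. Er, one more pair of strings
This one is distinctive: all the leading characters are the same and only the last one differs. Once the last character mismatches, a person sees at once that the pattern should shift 6 places (a computer has no human intuition, so we have to write a program that makes it behave).
To skip these unnecessary steps and bring the time complexity down as far as possible, K, M and P worked out: KMP.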
#### How KMP works
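With the ordinary algorithm this takes 16 comparisons, while KMP needs only ...
Why can so many comparisons be skipped? Because of the core of KMP: the next[] jump table.
A pattern usually carries some information about itself: prefix containment.
A prefix of the pattern may also occur inside the pattern as a substring that is not a prefix. I call this the prefix containment problem.
If that does not make sense, skip it and read the explanation below.
In this pattern (pictured in the original post), the last three characters are the same as the first three, so for now think of that as prefix containment.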
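Take the pattern a b c a b c a c a b. Its 4-character prefix a b c a also occurs inside the pattern as a substring: a b c (a b c a) c a b. So if matching against the target only fails at the 8th character, the 4 target characters just before the current one are known to equal the first 4 characters of the pattern. When the pattern is shifted, its 5th character can therefore be aligned directly with the current target character and compared, which lets the pattern jump forward several characters in one step.
The next array is what makes these multi-character jumps possible.
Meaning of next[j] = k: for the j-th character s[j] of the pattern, k is the largest value (k < j) for which s[1···k-1] = s[j-(k-1)···j-1] holds. This textbook definition is 1-indexed; the code below is 0-indexed and stores slightly shifted values.
Put simply, it is the length of the longest prefix of the pattern that is also a suffix of the part matched so far.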
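To make "longest prefix that is also a suffix" concrete, here is a brute-force sketch that applies the definition directly to the first len characters of a pattern. The name longest_border is my own, and this is only for checking small examples by hand; it is not how KMP builds its table.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (name is illustrative): largest k < len such that
// the first k characters of s equal the last k characters of s[0..len-1].
int longest_border(const char *s, int len)
{
    for (int k = len - 1; k > 0; k--)
        if (memcmp(s, s + len - k, k) == 0) return k;
    return 0;
}

// The example from the text: after 7 matched characters of abcabcacab,
// the 4-character prefix abca can be reused.
void check_longest_border()
{
    assert(longest_border("abcabcacab", 7) == 4);
    assert(longest_border("abcabcacab", 8) == 0);
    assert(longest_border("abcabce", 6) == 3);
}
```

The last check is the first example of the article: once abcabc has matched, the pattern can be shifted by 6 - 3 = 3 places.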
Never mind how next is used for now; first see how it is computed:
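#include <cstring>
// nex[i] = index of the last character of the longest proper prefix of
// s[0..i] that is also a suffix of s[0..i], or -1 if there is none.
void getnex(int nex[],char *s)
{
    int len=strlen(s);
    int i,j=-1;
    nex[0]=-1;
    for(i=1;i<len;i++)
    {
        // mismatch: fall back to shorter and shorter candidates
        while(j>-1&&s[j+1]!=s[i])
            j=nex[j];
        // match: the candidate grows by one character
        if(s[j+1]==s[i])
            j++;
        nex[i]=j;
    }
}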
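// nex stands for the next array. next is not a reserved word in C++, but the standard library declares std::next,
// so with "using namespace std;" a global array named next can become ambiguous and fail to compile. nex avoids the clash.
There is also this way of writing it: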
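// nex[i] = length of the longest proper prefix of s[0..i-1] that is also
// its suffix; nex[0] = -1. Writes nex[0..len], so nex needs len+1 slots.
void get_nex(int nex[],char *s)
{
    int len=strlen(s);
    int i=0,j=-1;
    nex[0]=-1;
    while(i!=len)
    {
        if(j==-1||s[i]==s[j])
            nex[++i]=++j;   // extend the current match by one character
        else j=nex[j];      // mismatch: fall back
    }
}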
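// How it differs from the version above: here nex[i] is a length for the prefix ending just before position i, and nex[len] is also filled in;
// above, nex[i] is an index (length minus 1) for the prefix ending at position i. Try both by hand on a small string.
Now see how next performs the jump: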
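One way to see how the two forms relate is to run both on the same string and compare them entry by entry. The sketch below repeats the two functions (with const added) so that it compiles on its own; tables_agree is a hypothetical helper of mine that checks that entry i+1 of the second table equals entry i of the first table plus one.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// First form: nex[i] = (border length of s[0..i]) - 1.
void getnex(int nex[], const char *s)
{
    int len = (int)strlen(s);
    int j = -1;
    nex[0] = -1;
    for (int i = 1; i < len; i++) {
        while (j > -1 && s[j + 1] != s[i]) j = nex[j];
        if (s[j + 1] == s[i]) j++;
        nex[i] = j;
    }
}

// Second form: nex[i] = border length of s[0..i-1]; fills nex[0..len].
void get_nex(int nex[], const char *s)
{
    int len = (int)strlen(s);
    int i = 0, j = -1;
    nex[0] = -1;
    while (i != len) {
        if (j == -1 || s[i] == s[j]) nex[++i] = ++j;
        else j = nex[j];
    }
}

// Hypothetical check (name is illustrative): second[i+1] == first[i] + 1.
bool tables_agree(const char *s)
{
    int len = (int)strlen(s);
    std::vector<int> a(len + 1), b(len + 1);
    getnex(a.data(), s);
    get_nex(b.data(), s);
    for (int i = 0; i < len; i++)
        if (b[i + 1] != a[i] + 1) return false;
    return true;
}

void check_tables_agree()
{
    assert(tables_agree("abcabcacab"));
    assert(tables_agree("aaaa"));
}
```

So the two tables hold the same information, shifted by one position and offset by one. The second form is the handier one when the value for the whole string is needed.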
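int kmp(char *ss,char *s,int nex[])
{
    getnex(nex,s);
    int lens=strlen(s);
    int lenss=strlen(ss);
    int i,j=-1;   // j: index in s of the last matched character, -1 if none
    for(i=0;i<lenss;i++)
    {
        // mismatch: jump via nex instead of moving back in ss
        while(j>-1&&s[j+1]!=ss[i])
            j=nex[j];
        if(s[j+1]==ss[i])
            j++;
        if(j==lens-1)
        {
            return i-lens+1; // pattern found: index in the target of its first character
        }
    }
    return -1; // pattern not found
}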
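The function above stops at the first match. As a sketch of a small variation (my own, not from the original post), the same loop can report every occurrence, including overlapping ones, by falling back through nex after a full match instead of returning. The name kmp_find_all and the use of std::string and std::vector are my choices for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (name is illustrative): start indices of every
// occurrence of s in ss, overlapping occurrences included.
std::vector<int> kmp_find_all(const std::string &ss, const std::string &s)
{
    std::vector<int> hits;
    int m = (int)s.size();
    if (m == 0) return hits;  // assumption: an empty pattern reports nothing
    // nex[i] = index of the last character of the longest border of s[0..i], or -1
    std::vector<int> nex(m, -1);
    for (int i = 1, j = -1; i < m; i++) {
        while (j > -1 && s[j + 1] != s[i]) j = nex[j];
        if (s[j + 1] == s[i]) j++;
        nex[i] = j;
    }
    for (int i = 0, j = -1; i < (int)ss.size(); i++) {
        while (j > -1 && s[j + 1] != ss[i]) j = nex[j];
        if (s[j + 1] == ss[i]) j++;
        if (j == m - 1) {
            hits.push_back(i - m + 1);
            j = nex[j];  // keep the longest border so overlaps are still found
        }
    }
    return hits;
}

void check_kmp_find_all()
{
    assert(kmp_find_all("abcabcabce", "abcabce") == std::vector<int>{3});
    assert(kmp_find_all("aaaaaa", "aa").size() == 5);
}
```

On the article's example the only hit is at index 3. On aaaaaa with pattern aa there are 5 overlapping hits, at indices 0 to 4.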
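Let's look at a concrete problem.
Problem (HDU 2087, 剪花布條)
There is a strip of patterned cloth with some designs on it, and a small decorative strip, ready to use, that also has designs on it. Given the cloth and the decorative strip, work out the largest number of decorative strips that can be cut from the cloth.
Input
The input contains several data sets, each a pair consisting of a cloth strip and a decorative strip. The strips are written in visible ASCII characters; there are as many kinds of pattern as there are visible ASCII characters. Neither the cloth nor the decorative strip is longer than 1000 characters. Processing stops when a # character is met.
Output
Output the maximum number of decorative strips that can be cut from the cloth; if not even one can be cut, just honestly output 0. Each result goes on its own line.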
Sample Input
abcde a3
aaaaaa aa
#
Sample Output
0
3
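The idea is straightforward: find the occurrences of the decorative strip inside the target string. Brute force is fast enough to pass, but KMP is still the recommended way.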
#include<cstdio>
#include<cstring>
// nex[i] = index of the last character of the longest border of s[0..i], or -1
void getnex(char *s,int nex[])
{
    int j=-1;
    int len=strlen(s);
    nex[0]=-1;
    int i;
    for(i=1;i<len;i++)
    {
        while(j>-1&&s[j+1]!=s[i])
            j=nex[j];
        if(s[j+1]==s[i])
            j++;
        nex[i]=j;
    }
}
// counts non-overlapping occurrences of s in ss
int kmp(char *ss,char *s,int nex[])
{
    getnex(s,nex);
    int i,j,lenss,lens;
    lenss=strlen(ss);
    lens=strlen(s);
    j=-1;
    int ans=0;
    for(i=0;i<lenss;i++)
    {
        while(j>-1&&s[j+1]!=ss[i])
            j=nex[j];
        if(s[j+1]==ss[i])
            j++;
        if(j==lens-1)
        {
            j=-1;   // a piece has been cut out, so start again from scratch
            ans++;
        }
    }
    return ans;
}
int main()
{
    char ss[1005],s[1005];
    while(~scanf("%s",ss))
    {
        if(strcmp(ss,"#")==0) break;   // a lone # ends the input
        scanf("%s",s);
        int nex[1005];
        int ans=kmp(ss,s,nex);
        printf("%d\n",ans);
    }
    return 0;
}
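For comparison, the brute-force route mentioned above can be sketched with std::string::find. This is my own illustration (count_pieces is a hypothetical name), and as far as I know the standard makes no linear-time promise for find, but with strips of at most 1000 characters that does not matter.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (name is illustrative): count non-overlapping
// occurrences of piece in cloth, greedily from left to right.
int count_pieces(const std::string &cloth, const std::string &piece)
{
    if (piece.empty()) return 0;  // assumption: an empty piece counts as zero
    int count = 0;
    std::string::size_type pos = 0;
    while ((pos = cloth.find(piece, pos)) != std::string::npos) {
        count++;
        pos += piece.size();  // resume right after the piece just cut out
    }
    return count;
}

// The two sample cases of the problem.
void check_count_pieces()
{
    assert(count_pieces("abcde", "a3") == 0);
    assert(count_pieces("aaaaaa", "aa") == 3);
}
```

Skipping past each piece after it is found plays the same role as resetting j to -1 in the KMP version.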
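The cycle (repeating unit) problem
What is a cycle?
Look it up on Baidu. In short, it is the shortest block that, repeated a whole number of times, spells out the string.
The next values from KMP give a string's cycle; for example, the cycle of ababab is ab and the cycle of abcd is abcd. Concretely: let len be the length of the string, and let next[len] be the next value for the position just past the last character (indices start at 0; this is the second form of the table above, the one that fills next[0..len]). If len % (len - next[len]) == 0, the cycle repeats len / (len - next[len]) times; otherwise it repeats once.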
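To sanity-check the next-based formula on small inputs, it helps to have a version that works straight from the definition. The sketch below is a brute-force helper of my own (smallest_cycle is a hypothetical name); it tries every length p that divides len.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (name is illustrative): length of the shortest block
// that, repeated a whole number of times, gives s. Returns s.size() when
// the only such block is s itself.
int smallest_cycle(const std::string &s)
{
    int len = (int)s.size();
    for (int p = 1; p < len; p++) {
        if (len % p != 0) continue;        // the block must tile s exactly
        bool ok = true;
        for (int i = p; i < len && ok; i++)
            if (s[i] != s[i - p]) ok = false;
        if (ok) return p;
    }
    return len;
}

// The examples from the text: ab has length 2, abcd has length 4.
void check_smallest_cycle()
{
    assert(smallest_cycle("ababab") == 2);
    assert(smallest_cycle("abcd") == 4);
}
```

For ababab this gives 2, so the cycle repeats 6 / 2 = 3 times, which agrees with len / (len - next[len]) = 6 / (6 - 4).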
Power Strings (POJ 2406)
Given two strings a and b we define a*b to be their concatenation. For example, if a = “abc” and b = “def” then a*b = “abcdef”. If we think of concatenation as multiplication, exponentiation by a non-negative integer is defined in the normal way: a^0 = “” (the empty string) and a^(n+1) = a*(a^n).
Input
Each test case is a line of input representing s, a string of printable characters. The length of s will be at least 1 and will not exceed 1 million characters. A line containing a period follows the last test case.
Output
For each s you should print the largest n such that s = a^n for some string a.
Sample Input
abcd
aaaa
ababab
.
Sample Output
1
4
3
Hint
This problem has huge input, use scanf instead of cin to avoid time limit exceed.
#include<cstdio>
#include<cstring>
char s[1000000+100];
int nex[1000000+100];
// nex[i] = length of the longest border of s[0..i-1]; fills nex[0..len]
void get_nex(int nex[],char *s)
{
    int len=strlen(s);
    int i=0,j=-1;
    nex[0]=-1;
    while(i!=len)
    {
        if(j==-1||s[i]==s[j])
            nex[++i]=++j;
        else j=nex[j];
    }
}
int main()
{
    // scanf returns EOF (non-zero) at end of input, so test for == 1
    while(scanf("%s",s)==1)
    {
        if(strcmp(s,".")==0) break;   // a lone period ends the input
        int len=strlen(s);
        get_nex(nex,s);
        int t=len-nex[len];           // candidate cycle length
        if(len%t==0)
            printf("%d\n",len/t);     // prints 1 when t==len
        else printf("%d\n",1);
    }
    return 0;
}
There is also an extended version of the cycle problem:
Cyclic Nacklace (HDU 3746)
CC always becomes very depressed at the end of this month, he has checked his credit card yesterday, without any surprise, there are only 99.9 yuan left. he is too distressed and thinking about how to tide over the last days. Being inspired by the entrepreneurial spirit of “HDU CakeMan”, he wants to sell some little things to make money. Of course, this is not an easy task.
As Christmas is around the corner, Boys are busy in choosing christmas presents to send to their girlfriends. It is believed that chain bracelet is a good choice. However, Things are not always so simple, as is known to everyone, girl’s fond of the colorful decoration to make bracelet appears vivid and lively, meanwhile they want to display their mature side as college students. after CC understands the girls demands, he intends to sell the chain bracelet called CharmBracelet. The CharmBracelet is made up with colorful pearls to show girls’ lively, and the most important thing is that it must be connected by a cyclic chain which means the color of pearls are cyclic connected from the left to right. And the cyclic count must be more than one. If you connect the leftmost pearl and the rightmost pearl of such chain, you can make a CharmBracelet. Just like the pictrue below, this CharmBracelet’s cycle is 9 and its cyclic count is 2:
Now CC has brought in some ordinary bracelet chains, he wants to buy minimum number of pearls to make CharmBracelets so that he can save more money. but when remaking the bracelet, he can only add color pearls to the left end and right end of the chain, that is to say, adding to the middle is forbidden.
CC is satisfied with his ideas and ask you for help.
Input
The first line of the input is a single integer T ( 0 < T <= 100 ) which means the number of test cases.
Each test case contains only one line describe the original ordinary chain to be remade. Each character in the string stands for one pearl and there are 26 kinds of pearls being described by ‘a’ ~’z’ characters. The length of the string Len: ( 3 <= Len <= 100000 ).
Output
For each case, you are required to output the minimum count of pearls added to make a CharmBracelet.
Sample Input
3
aaa
abca
abcde
Sample Output
0
2
5
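*The problem gives a string and asks how many more characters are needed before it consists of whole cycles. Let p be len minus the length of the longest border: if p divides len and p is not len itself, nothing is needed; otherwise p - len % p characters are missing.
So, straight to the code.*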
#include<stdio.h>
#include<cstring>
// the string is stored from a[1], so room is needed for 100000 characters,
// the terminating '\0', and the entry nexts[len+1]
char a[100005];
int nexts[100005];
int main()
{
    int T,len,i,j,nima;
    while(scanf("%d",&T)!=EOF)
    {
        while(T--)
        {
            scanf("%s",a+1);
            len=strlen(a+1);
            // 1-indexed next table: nexts[i] = border length of a[1..i-1], plus 1
            i=1;j=0;nexts[1]=0;
            while(i<=len)
            {
                if(j==0||a[i]==a[j])
                {
                    i++;j++;
                    nexts[i]=j;
                }
                else
                    j=nexts[j];
            }
            nima=(len+1)-nexts[len+1];   // length of the smallest cycle
            if(len%nima==0&&len!=nima)   // note: the cycle must not be the whole string, hence len!=nima
                printf("%d\n",0);
            else
                printf("%d\n",nima-len%nima);
        }
    }
    return 0;
}
// Never mind the variable name nima
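For checking the formula against the sample data, the same computation can be wrapped in a 0-indexed function. This is only a sketch under my own naming (pearls_to_add is hypothetical); it uses the second form of the next table, where the last entry is the longest border length of the whole string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (name is illustrative): how many characters must be
// appended so that s consists of at least two whole cycles.
int pearls_to_add(const std::string &s)
{
    int len = (int)s.size();
    if (len == 0) return 0;  // assumption: nothing to do for an empty chain
    // border[i] = length of the longest border of s[0..i-1]; border[0] = -1
    std::vector<int> border(len + 1);
    border[0] = -1;
    int i = 0, j = -1;
    while (i != len) {
        if (j == -1 || s[i] == s[j]) border[++i] = ++j;
        else j = border[j];
    }
    int p = len - border[len];               // smallest cycle length
    if (len % p == 0 && p != len) return 0;  // already two or more whole cycles
    return p - len % p;                      // fill up the last, partial cycle
}

// The three sample cases of the problem.
void check_pearls_to_add()
{
    assert(pearls_to_add("aaa") == 0);
    assert(pearls_to_add("abca") == 2);
    assert(pearls_to_add("abcde") == 5);
}
```

For abca the longest border is a, so p = 3 and 3 - 4 % 3 = 2 characters (bc) complete the second cycle. For abcde there is no border, p = 5 = len, and a full extra cycle of 5 is needed.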
That's all for KMP this time. Honestly, it really is hard to understand... maybe I'm just not good enough yet.