省选前模板复习 | Rye_Catcher - Stay Young,Stay Simple

PREFACE

也许是OI生涯最后一场正式比赛了，说是省选前模板，其实都是非常基础的东西，穿插了英文介绍和部分代码实现

祝各位参加JXOI2019的都加油吧

也希望今年JX能翻身,在国赛中夺金

Some Words:

property n. [C] (数学)性质

数学知识

见数学知识小结

字符串

KMP算法Knuth-Morris-Pratt Algorithm

KMP算法,又称模式匹配算法,是用来在一个文本串(text string)s中找到所有模式串(pattern)w出现的位置.

它是通过当失配(mismatch)发生时,模式串本身能提供足够的信息来决定下一个匹配能从哪里开始,这样就可以避开对前面已经匹配好的字符的再次遍历检查(即暴力做法)

“In computer science, the Knuth–Morris–Pratt string-searching algorithm (or KMP algorithm) searches for occurrences of a “word” W within a main “text string” S by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters.”

来源: 维基百科

我们需要构造一个函数next,next[n]表示的是模式串w中以第n个字符结尾的非前缀子串(substring)与w的前缀(prefix)能够匹配的最大长度

具体实现过程与思路见OI WIKI

简单说一下比较难懂的在w[i]!=w[j+1],为什么指针j跳向next[j]

显然w[1~j]是等于i-1前j个字母的字串的,但是我们比较发现w[j+1]!=w[i]

我们想要的新的j是使得除了第$j+1$个字符需要与w[i]比较外,w[1]~w[j]的前缀等于i-1(包括i-1)前j个字母的后缀(suffix)

又由于i-1前任意k(k<=next[j])个字符后缀等于next[j]前任意k个字母的后缀

又因为w[1~next[next[j]]]等于next[j]前next[next[j]]长度的后缀,即等于i-1前next[next[j]]长度的后缀

所以新的j=next[j]就满足上述条件

n=strlen(a+1);m=strlen(b+1);//a is the text string
ne[1]=0;//next
for(ri i=2,j=0;i<=m;i++){
     while(j>0&&b[i]!=b[j+1])j=ne[j];
     if(b[i]==b[j+1])j++;
     ne[i]=j;
}
for(ri i=1,j=0;i<=n;i++){
     while(j>0&&a[i]!=b[j+1])j=ne[j];
     if(a[i]==b[j+1])j++;
     fail[i]=j;
     if(fail[i]==m){
     printf("%d\n",i-m+1);
     fail[i]=1;
     j=ne[j];
    }
}

哈希Hash Function

不想中文介绍了QAQ,OI中常用于数据的离散化(Discretization)

关于更多离散化内容

但是哈希函数有可能发生哈希冲突(Hash Collision),为了避免这种情况,我们可以采取双哈希或是拉链

“A hash function is any function) that can be used to map data) of arbitrary size onto data of a fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. Hash functions are often used in combination with a hash table, a common data structure used in computer software for rapid data lookup. Hash functions accelerate table or database lookup by detecting duplicated records in a large file. One such application is finding similar stretches in DNA sequences. They are also useful in cryptography.”

来源:维基百科

字符串哈希我一般用Rabin-Karp Hash,更多相关内容在链接里

#define ull unsigned long long 
ull x,hash_table[maxn];
char str[maxn];
void make_hash(int n){
    x=0;
    for(ri i=0;str[i];i++){
        x=x*131+a[i]-31;
        hash_table[i]=x;
    }
    return ;
}

对于两个串,我们可以将它们按上述方法映射(mapping)到哈希表后通过二分(binary serach)或是倍增(QAQ这个英语怎么说啊)来快速匹配

主要根据这个可以$O(1)$地判断两个字串是否相等

$w_{str_{i,j}}$

$=($ $a_i$ $p^{j-i}$+$a_{i+1}$ $p^{j-i-1}$+…+$a_{j}$ $p^0$)

$=$ $w_{pre_{j}}$ $-$ $w_{pre_{i-1}}$ $p^{j-i+1}$

数据结构Data Structure

栈Stack

栈是只有一端能进出元素的线性数据结构,所以它是后进后出(LIFO,Last In First Out)的

有入栈(push)与出栈(pop)两种基本操作

“In computer science, a stack is an abstract data type that serves as a collection) of elements, with two principal operations:

push, which adds an element to the collection, and
pop, which removes the most recently added element that was not yet removed.

The order in which elements come off a stack gives rise to its alternative name, LIFO (last in, first out). Additionally, a peek) operation may give access to the top without modifying the stack.[1]#cite_note-1)The name “stack” for this type of structure comes from the analogy to a set of physical items stacked on top of each other, which makes it easy to take an item off the top of the stack, while getting to an item deeper in the stack may require taking off multiple other items first.[2]#cite_note-clrs-2)”

来源:维基百科)

队列Queue

队列是一种”先进先出(FIFO,First In Fitst Out)”的线性数据结构.一般元素从右端入队,从左端出队.为了节省空间实现上可以采用循环队列

链表Linked List

(这几个都感觉没什么好说的)

讲一下邻接表(Adjacency List)吧,它可以看成是”带有索引数组的多个数据链表”,这种所引就是链表的表头,表头又构成了一个表头(headers)数组

存图常用邻接表或是邻接矩阵(Adjacency Matrix)

邻接表

int num_edge=1;
struct Edge{
    int ne,to,dis;
}edge[maxm];
void add_edge(int f,int to,int dis){
    edge[++num_edge].ne=h[f];
    edge[num_edge].to=to;
    edge[num_edge].dis=dis;
    h[f]=num_edge;
    return;
}

字典树Trie

字典树是一种用于实现字符串快速检索的多叉树结构，其节点都拥有字符指针.

“Trie is an efficient information reTrieval data structure.Using trie,search complexities can be brought too optimal limit(O(M),M is key length).However the penalty is on Trie storage requirements.”

int trie[maxn][27],tot=1;
bool end[maxn][27];
void Insert(){
    int t=tot,x;
    for(ri i=1;i<=len;i++){
        x=str[i]-'a';
        if(trie[t][x]==0)trie[t][x]=++tot;
        t=trie[t][x];
    }
    end[t][x]=1;
  	return ;
}
bool Match_Ok(){
    int t=tot,x;
    for(ri i=1;i<=len;i++){
        x=str[i]-'a';
        t=trie[t][x];
        if(!t)return false;
    }
    return end[t][x];
}

二叉堆Binary Heap

二叉堆是一颗满足堆性质的完全二叉树.根据排序方式可以分为大根堆(Max-Heap)和小根堆(Min-Heap)

对于大根堆,任意一个非根节点的权值都小于等于其父节点的权值;小根堆则反之

“A binary heap is a complete binary tree which satisfies the heap ordering property.The ordering can be one of two types:the min-heap property and the max-heap property.The value of each node of the max-heap is less than or equal to the value of its parent,with the maximum-value element at the root”

来源:cs.cmu.edu

int heap[maxn],n;//max-heap
void up_modify(int p){
    while(p>1){
        if(heap[p]>heap[p>>1]){
            swap(heap[p],heap[p>>1]);
            p=p>>1;
        }
        else break;
    }
    return ;
}
void insert(int val){
    heap[++n]=val;
    up(n);
}
void down(int p){
    int s=p<<1;
    while(s<=n){
        if(s<n&&heap[s]<heap[s+1])s++;//choost the maximum one of the two sons
        if(heap[s]>heap[p]){
            swap(heap[s],heap[p]);
            p=s,s=s<<1;
        }
        else break;
    }
    return;
}
void extract_top(){
    heap[1]=heap[n--];
    down(1);
    return ;
}

线段树Segment Tree

咕

图论Graph Theory

Dijkstra算法Dijkstra Algorithm

DIjkstra算法是为了解决单源最短路径问题(Single Source Shortest Path,SSSP)的，简单来说，就是求原点到其它所有点(vertex pl.vertices)的最短路(shortest path)

它是基于一个贪心思想,要求所有边都是非负的.大致就是不断拓展一颗最短路生成树

“Dijkstra’s algorithm is very similar to Prim’s algorithm for minimum spanning tree(MST).Like Prim’s MST,we generate a SPT(shortest path tree) with given source as root.We maintain two sets,one set contain vertices included in the shortest path tre,other set includes vertices not yet included in shortest path tree.At every step,we find a vertex which is in the other set and has a mimnum distance from the source.”

来源:geeksforgeeks.org

#define ri register int 
struct Vertex{
    int id,d;
    bool operator <(const Vertex & a)const{
        return d>a.d;
    }
    Vertex(){;}
    Vertex(int x,int y){id=x,d=y;}
};
priority_queue <Vertex> q;
void add_edge(int f,int to,int dis){
    edge[++num_edge].ne=h[f];
    edge[num_edge].to=to;
    edge[num_edge].dis=dis;
    h[f]=num_edge;
}
int dis[maxn];
bool vis[maxn];
void dijkstra(int s){
    int u,v,w;
    for(ri i=1;i<=n;i++){
        dis[i]=INF,vis[i]=0;
    }
    dis[s]=0;
    q.push(Vertex(s,0));
    while(q.size()){
        u=q.top().id,w=q.top().d;
        q.pop();
        if(vis[u])continue;
        for(ri i=h[u];i;i=edge[i].ne){
            v=edge[i].to;
            if(dis[v]>dis[u]+w){
                dis[v]=dis[u]+w;
                q.push(Vertex(v,dis[v]))
            }
        }
        vis[u]=1;
    }
    for(ri i=1;i<=n;i++)printf("%d\n",dis[i]);
    return ;
}

Bellman-Ford算法和SPFA算法Bellman-Ford Algorithm And SPFA

忽然发现ddl干不完了,因此介绍咕了

queue <int> q;
void spfa(){
    int u,v,w;
    for(int i=1;i<=n;i++)dis[i]=INF,vis[i]=0;
    dis[s]=0,vis[s]=1;
    q.push(s);
    while(q.size()){
        u=q.top();q.pop();
        vis[u]=0;
        for(int i=h[u];i;i=edge[i].ne){
            v=edge[i].to,w=edge[i].dis;
            if(dis[v]>dis[u]+w){
                dis[v]=dis[u]+w;
                if(!dis[v]){q.push(v);dis[v]=1;}
            }
        }
    }
    for(int i=1;i<=n;i++)printf("%d\n",dis[i]);
    return ;
}

Floyd算法Floyd Warshall Algorithm

Floyd算法是基于动态规划思想来解决任意两点之间最短路径(All Pairs Shortest Path problem,APSP)的算法

亦可用于可传递的关系(传递闭包),这个也先挖坑吧

memset(d,0x3f,sizeof(d));
for(int i=1;i<=n;i++)dis[i][i]=0;
for(int i=1;i<=m;i++){
    scanf("%d %d %d",&x,&y,&z);
    dis[x][y]=min(dis[x][y],z);
}
for(int k=1;k<=n;k++){
    for(int i=1;i<=n;i++){
        for(int j=1;j<=n;j++){
            dis[i][j]=min(dis[i][j],dis[i][k]+dis[k][j]);
        }
    }
}

Kruskal算法Kruskal Algorithm

Kruskal算法是求出一张给定联通边带权无向图的最小生成树算法.那么什么是最小生成树?

“Given a connected and undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees. A minimum spanning tree (MST) or minimum weight spanning tree for a weighted, connected and undirected graph is a spanning tree with weight less than or equal to the weight of every other spanning tree. The weight of a spanning tree is the sum of weights given to each edge of the spanning tree.”

来源:Geeksforgeeks 强烈安利这个英文网站,写的浅显易懂而且最重要的是,it is not blocked

Kruskal算法维护无向图的最小生成森林,每次在连接两个森林的边中找一条权值最小的将两个森林连起来.对于连通性我们使用并查集维护

struct Edge{
    int u,v,dis;
    bool operator <(const Edge & a)const {
        return dis<a.dis;
    }
}edge[maxm];
int fa[maxn];
int get(int x){
    if(fa[x]!=x)fa[x]=get(fa[x]);
    return fa[x];
}
void kruskal(){
    int u,v,w,cnt=0;
    for(int i=1;i<=m;i++){
    	scanf("%d %d %d",&edge[i].u,&edge[i].v,&edge[i].dis);
	}
    sort(edge+1,edge+1+m);
    for(int i=1;i<=m;i++){
    	u=edge[i].u,v=edge[i].v;
        u=get(u),v=get(v);
        if(u!=v){
            fa[u]=v;
            cnt++;
        }
        if(cnt==n-1){
            break;
        }
	}
    return ;
}

树链剖分Heavy Light Decomposition

树链剖分是用来解决一类树上问题的,可以将树上的操作转化为序列上的操作,同时使用数据结构维护可以使其复杂度更优,根据链剖方式可分为重链剖分和长链剖分等,这里介绍重链剖分

由于重链数量级是$log N$的,线段树时间复杂度是$log N$,总时间复杂度是$log^2 N$

还可以同时求LCA,但是操作时有许多细节需要注意,尤其是dfn[]和rnk[]

找它的英文找了好久终于找到了,就是Heavy Light Decomposition

“Suppose we have an unbalanced tree (not necessarily a Binary Tree) of n nodes, and we have to perform operations on the tree to answer a number of queries, each can be of one of the two types:

change(a, b): Update weight of the ath edge to b.
maxEdge(a, b): Print the maximum edge weight on the path from node a to node b. For example maxEdge(5, 10) should print 25.

“

来源:geeksforgeeks

但是感觉geeksforgeeks上的介绍写得并不是很好…就不贴了.写一点自己口胡的工地英文

To perform two operations in a more efficient time complexity,we can first find out the heavy son of all nodes.We call a node is heavy only when its size(the number of nodes in its subtree) is larger than any of its siblings.We can make it by one DFS. Then we connects all the heavy-son node and we get several heavy-son chains and light-son chains.And if we want to perform two operations on the path of two nodes,we can perform operations on the segment tree built from the heavy-son chains along the path until we reach LCA.

Since it takes O($logN$) to perform operations on segment tree,apparently the complexity depends on the number of light-son chains.We can prove that it is no more than $log N$ along the path from one node to the tree root.,because when we meet one light-son chain,the number of nodes in his father’s subtree is reduced by at least half.

int fa[maxn],dfn[maxn],rnk[maxn]; 
int w[maxn],dep[maxn],top[maxn],son[maxn],size[maxn];
int cnt=0;
void dfs_1(int now){
    int v;size[now]=1;
    for(int i=h[now];i;i=edge[i].to){
        v=edge[i].to;
        if(v==fa[now])continue;
        fa[v]=now,dep[v]=dep[now]+1;
        dfs_1(v);
        size[now]+=size[v];
        if(!son[now]||size[v]>size[son[now]])son[now]=v;
    }
    return ;
}
void dfs_2(int now,int t){
    int v;
    top[now]=t,dfn[now]=++cnt,rnk[cnt]=now;
    if(!son[now])return ;
    dfs_2(son[now],t);
    for(int i=h[now];i;i=edge[i].ne){
        v=edge[i].to;
        if(v==son[now]||v==fa[now])continue;
        dfs_2(v,v);
    }
    return ;
}
int L,R;
void operation_on_path(int x,int y){
    while(top[x]!=top[y]){
        if(dep[top[x]]<dep[top[y]])swap(x,y);
        L=dfn[top[x]],R=dfn[x];
        operation_on_segmenttree(L,R);
        x=fa[top[x]];
    }
    if(dfn[x]>dfn[y])swap(x,y);//x is the LCA
    L=dfn[x],R=dfn[y];
    operation_on_segmenttree(L,R);
    return;
}
void operation_on_subtree(int x){
    L=dfn[x],R=dfn[x]+size[x]-1;
    operation_on_segmenttree(L,R);
    return ;
}

动态规划Dynamic Programming

估计要咕了

问题具有以下两个性质时,可以用动态规划

1) Overlapping Subproblems重叠子问题

2) Optimal Substructure最优子结构