Fuzzy Google Suggest

Google Suggest is a typical feature of the Google search box. It guesses what you’re typing and offers suggestions in real time (Fig 1). Sometimes, we may mistype the keyword. For example, when we want to find the information of “mcgrady” (a famouse NBA star), we mistyped it to “macgrady” (Fig 2). The clever Google can still give you the right suggestions. It is amazing, isn’t it? Fig 1: Google Suggest Fig 2: Fuzzy Google Suggest In this problem, your job is to implement a similar system. You will be given a data set including a large amount of English words. Your system should provide the search service. For each input query word, find the prefix-fuzzy-match words from the data set and ouput the number of those words. In the following, I will explain you the prefix fuzzy match. When we say the two words are fuzzy-match (without “prefix”), we mean that they look similar and their Edit Distance meets a given threshold (If you have no idea of the Edit Distance, please refer to the Hints below). When we say the word A is prefix-fuzzy-match (with “prefix”) to the word B, we mean that there exists a prefix p of word B, and the Edit Distance between p and B meets a given threshold. For Example, the data set contains four English words: “content”, “common”, “onganize”, “case”. The input query contains a word “con”

2/2 and a Edit Distance threshold 1. “con” is prefix-fuzzy-match to the words “content”, “common”, “onganize”, but “case” doesn’t meet the condition, because the minimun Edit Distance between all of its prefixes and “con” is 2 which is larger than the given threshold 1. Input The input file consists of two parts. The first part is a data set with n (1 ≤ n ≤ 300, 000) English words. The second part is the m (1 ≤ m ≤ 300) input queries. Each query has a word and an Edit Distance threshold edth (0 ≤ edth ≤ 2), separated by a space. The words in the data set and the queries contain at most 10 small letters (a-z). Output For each input query, output the number of words from data set that the query word is prefix-fuzzy- match to. Hints(From Wikipedia): The Edit Distance between two strings is given by the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. For example, the Edit Distance between “kitten” and “sitting” is 3, since the following three edits change one into the other, and there is no way to do it with fewer than three edits: •kitten → sitten (substitution of ‘s’ for ‘k’) •sitten → sittin (substitution of ‘i’ for ‘e’) •sittin → sitting (insert ‘g’ at the end). Sample Input 4 content common onganize case 7 c0 con 0 con 2 con 1 com 1 comm 2 cog 1 Sample Output 3 1 4 3 2 2 2