C# LeetCode Practice #819: Most Common Word

Problem

Given a paragraph and a list of banned words, return the most frequent word that is not in the list of banned words.  It is guaranteed there is at least one word that isn't banned, and that the answer is unique.

Words in the list of banned words are given in lowercase, and free of punctuation.  Words in the paragraph are not case sensitive.  The answer is in lowercase.

Input: paragraph = "Bob hit a ball, the hit BALL flew far after it was hit." banned = ["hit"]

Output: "ball"

Explanation: "hit" occurs 3 times, but it is a banned word. "ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph. Note that words in the paragraph are not case sensitive, that punctuation is ignored (even if adjacent to words, such as "ball,"), and that "hit" isn't the answer even though it occurs more because it is banned.

Note:

  • 1 <= paragraph.length <= 1000.
  • 1 <= banned.length <= 100.
  • 1 <= banned[i].length <= 10.
  • The answer is unique and written in lowercase (even if its occurrences in paragraph contain uppercase letters, and even if it is a proper noun).
  • paragraph only consists of letters, spaces, or the punctuation symbols !?',;.
  • There are no hyphens or hyphenated words.
  • Words only consist of letters, never apostrophes or other punctuation symbols.
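
The normalization rules in the notes (case-insensitive words, punctuation ignored, only spaces and !?',;. as non-letters) can be sketched as a small tokenizer. This is a minimal illustration, not part of the original solution; the names ParagraphTokenizer and Tokenize are hypothetical.

```csharp
using System;

public static class ParagraphTokenizer {
    // Separators allowed by the problem statement: space plus !?',;.
    private static readonly char[] Separators = { ' ', '!', '?', '\'', ',', ';', '.' };

    // Lowercases the paragraph and splits it into words, dropping the empty
    // entries produced by adjacent separators such as ", ".
    public static string[] Tokenize(string paragraph) {
        return paragraph
            .ToLowerInvariant()
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the sample paragraph, Tokenize yields 13 lowercase words, with "hit" appearing three times and "ball" twice.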

Example

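using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public class Program {

    public static void Main(string[] args) {
        var paragraph = "Bob. hIt, baLl";
        var banned = new string[] { "bob", "hit" };

        var res = MostCommonWord(paragraph, banned);
        Console.WriteLine(res);

        Console.ReadKey();
    }

    private static string MostCommonWord(string paragraph, string[] banned) {
        // Lowercase the paragraph, then replace every non-letter with a space.
        // Alternatively, treat only the punctuation listed in the problem (!?',;.) as separators.
        // ToLowerInvariant avoids culture-specific mappings (e.g. Turkish 'I' -> dotless 'ı').
        var sb = new StringBuilder(paragraph.ToLowerInvariant());
        for(var i = 0; i < sb.Length; i++) {
            // After lowercasing, only the range 'a'..'z' counts as a letter.
            if(sb[i] < 'a' || sb[i] > 'z') {
                sb[i] = ' ';
            }
        }
        // Count occurrences with a dictionary.
        var dic = new Dictionary<string, int>();
        // RemoveEmptyEntries drops the empty strings produced by consecutive spaces.
        var split = sb.ToString().Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        foreach(var word in split) {
            // Skip words that appear in the banned list.
            if(!banned.Contains(word)) {
                if(dic.ContainsKey(word)) {
                    dic[word]++;
                } else {
                    dic[word] = 1;
                }
            }
        }
        // Return the word with the highest count.
        return dic.OrderByDescending(d => d.Value).First().Key;
    }

}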

The code above is one possible implementation. For this example it prints:

ball

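Analysis:

Lowercasing, filtering, and splitting the paragraph each take O(n) time, where n is the paragraph length. The solution as a whole is not strictly O(n), though: banned.Contains scans the banned array (up to b entries) for every word, and OrderByDescending sorts the k distinct non-banned words in O(k log k). The total is therefore O(n + w·b + k log k) for w words, which stays small under the given limits (n <= 1000, b <= 100).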
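
One way to reach O(n + b) expected time is to look up banned words in a hash set and track the running maximum instead of sorting. The following is a minimal sketch of that variant, not the original author's code; the class name MostCommonWordSketch is hypothetical, and it assumes (as the problem guarantees) that at least one word is not banned.

```csharp
using System;
using System.Collections.Generic;

public static class MostCommonWordSketch {
    public static string MostCommonWord(string paragraph, string[] banned) {
        // Hash-based lookup: average O(1) per query instead of scanning the array.
        var bannedSet = new HashSet<string>(banned);
        var counts = new Dictionary<string, int>();
        // Separators allowed by the problem statement: space plus !?',;.
        var separators = new[] { ' ', '!', '?', '\'', ',', ';', '.' };

        string best = null;
        var bestCount = 0;
        foreach (var word in paragraph.ToLowerInvariant()
                     .Split(separators, StringSplitOptions.RemoveEmptyEntries)) {
            if (bannedSet.Contains(word)) continue;

            // TryGetValue leaves count at 0 when the word has not been seen yet.
            counts.TryGetValue(word, out var count);
            counts[word] = ++count;

            // Track the running maximum so no sort is needed at the end.
            if (count > bestCount) {
                bestCount = count;
                best = word;
            }
        }
        // Null only if every word is banned, which the problem rules out.
        return best;
    }
}
```

Called with the sample paragraph from the problem statement and banned = ["hit"], this returns "ball"; with "Bob. hIt, baLl" and banned = ["bob", "hit"] it also returns "ball".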