10008 - What's Cryptanalysis? - SamChien's blog-山姆謙的技術筆記

Cryptanalysis is the process of breaking someone else's cryptographic writing. This sometimes involves some kind of statistical analysis of a passage of (encrypted) text. Your task is to write a program which performs a simple analysis of a given text.

Input

The first line of input contains a single positive decimal integer n. This is the number of lines which follow in the input. The next n lines will contain zero or more characters (possibly including whitespace). This is the text which must be analyzed.

Output

Each line of output contains a single uppercase letter, followed by a single space, then followed by a positive decimal integer. The integer indicates how many times the corresponding letter appears in the input text. Upper and lower case letters in the input are to be considered the same. No other characters must be counted. The output must be sorted in descending count order; that is, the most frequent letter is on the first output line, and the last line of output indicates the least frequent letter. If two letters have the same frequency, then the letter which comes first in the alphabet must appear first in the output. If a letter does not appear in the text, then that letter must not appear in the output.

Sample Input

3
This is a test.
Count me 1 2 3 4 5.
Wow!!!!  Is this question easy?

Sample Output

S 7
T 6
I 5
E 4
O 3
A 2
H 2
N 2
U 2
W 2
C 1
M 1
Q 1
Y 1

出處: UVa Online Judge - What's Cryptanalysis?

問題敘述

此問題會先輸入一個整數 n，表示後面要輸入的文字行數。接著會輸入 n 行文字，您的任務是要在這些文字中統計出所有英文字母出現的次數 (忽略大小寫)，並且依據出現次數由多到少依序輸出，若有出現次數一樣的字母，則將字母順序比較前面的先輸出，例如: "A 2" 和 "H 2"，就要先輸出 "A 2" 然後再輸出 "H 2"。

解題思路

我的想法是先統計出每個字母出現的次數，這用 c++ STL 的 map 可以很容易做到。再把每對字母和次數綁成一個物件並且依據字母順序排好，最後利用穩定排序來依據次數大小做排序。穩定排序可以用重新定義 c++ STL algorithm 裡的 sort 的比較函數來做到，詳細請看程式碼。

c++ 程式碼

#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;

class EngTimesPair {
    public:
        char eng;
        int times;
};

bool pairCmp(const EngTimesPair* thisPair, const EngTimesPair* otherPair) {
    if (thisPair->times > otherPair->times)
        return true;
    else if (thisPair->times == otherPair->times && thisPair->eng < otherPair->eng)
        return true;
    else
        return false;
}

int main() {
    int n;
    map<char, int> countTable;
    vector<EngTimesPair*> pairList;
    string inputLine;

    (cin >> n).get();
    for (int i=1; i<=n; i++) {
        getline(cin, inputLine);
        for (int j=0; j<inputLine.length(); j++) {
            if ((inputLine[j] >= 'A' && inputLine[j] <= 'Z') ||
                (inputLine[j] >= 'a' && inputLine[j] <= 'z')) {
                if (inputLine[j] >= 'a' && inputLine[j] <= 'z')
                    inputLine[j] -= 'a' - 'A';
                countTable[inputLine[j]]++;
            }
        }
    }
    for (int i='A'; i<'Z'; i++) {
        if (countTable[(char)i]) {
            EngTimesPair* pairs = new EngTimesPair();

            pairs->eng = (char)i;
            pairs->times = countTable[(char)i];
            pairList.push_back(pairs);
        }
    }
    sort(pairList.begin(), pairList.end(), pairCmp);
    for (int i=0; i<pairList.size(); i++)
        cout << pairList[i]->eng << " " << pairList[i]->times << endl;
    for (int i=0; i<pairList.size(); i++)
        delete pairList[i];

    return 0;
}

SamChien's blog-山姆謙的技術筆記

2013年4月5日星期五