[Dragon Book Answers] Chapter 3 Solutions (Unfinished)
Exercise 3.3
Problem 3.3.1
Consult the language reference manuals to determine
The sets of characters that form the input alphabet (excluding those that may only appear in character strings or comments).
The lexical form of numerical constants and
The lexical form of identifiers.
for each of the following languages:
C
C++
C#
Fortran
Java
Lisp
SQL
Answer:
This exercise is not that important. Looking up the reference manual for each of these languages would take a lot of time for little benefit, so we skip it for now and will come back to it if time permits.
Problem 3.3.2
Describe the languages denoted by the following regular expressions:
a (a | b)* a
((ε|a)b*)*
(a|b)*a(a|b)(a|b)
a*ba*ba*ba*
(aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*
Answer:
Strings of a’s and b’s that begin and end with a (so the length is at least 2).
All strings of a’s and b’s, including the empty string. This is a little tricky: at first (ε|a) looks like a constraint on where a may appear, but since b* can match the empty string, each iteration can contribute a single a or a run of b’s, so any string can be built.
Strings of a’s and b’s whose third character from the end is a.
Strings of a’s and b’s with exactly three b’s.
Strings of a’s and b’s with an even number of a’s and an even number of b’s.
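These descriptions can be checked by brute force. The sketch below is only an illustration, not part of the textbook answer: it enumerates every string over {a, b} up to a small length and compares each regular expression with a plain-language predicate. The helper names (`all_strings`, `CASES`, `mismatches`) are made up for this example, (ε|a) is written as `a?` because Python's `re` has no ε symbol, and the last expression is taken with its Kleene stars, `(aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*`.

```python
import re
from itertools import product


def all_strings(max_len, alphabet="ab"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


# Each entry pairs a regular expression with a predicate that states
# the described language directly.
CASES = [
    (r"a(a|b)*a", lambda s: len(s) >= 2 and s[0] == "a" and s[-1] == "a"),
    (r"(a?b*)*", lambda s: True),  # (ε|a) written as a?
    (r"(a|b)*a(a|b)(a|b)", lambda s: len(s) >= 3 and s[-3] == "a"),
    (r"a*ba*ba*ba*", lambda s: s.count("b") == 3),
    (r"(aa|bb)*((ab|ba)(aa|bb)*(ab|ba)(aa|bb)*)*",
     lambda s: s.count("a") % 2 == 0 and s.count("b") % 2 == 0),
]


def mismatches(max_len=6):
    """Return (pattern, string) pairs where regex and predicate disagree."""
    bad = []
    for pattern, predicate in CASES:
        compiled = re.compile(pattern)
        for s in all_strings(max_len):
            if (compiled.fullmatch(s) is not None) != predicate(s):
                bad.append((pattern, s))
    return bad
```

An empty list from `mismatches()` means the descriptions agree with the expressions on every string up to the chosen length. That is evidence, not a proof.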
Problem 3.3.3
In a string of length n, how many of the following are there?
Prefixes
Suffixes
Proper prefixes
Substrings
Subsequences
Answer:
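n+1, counting the empty string and the string itself.
n+1, by the same argument.
n-1 (for n ≥ 1): every prefix except the empty string and the string itself.
n(n+1)/2 + 1. There are n-k+1 substrings of length k for k = 1, 2, …, n, which sums to n(n+1)/2, plus one for the empty string. This counts substrings by position, so it is exact when all characters are distinct.
2^n. Each of the n characters is either kept or dropped, so we are counting the subsets of n positions. Again, this is exact when all characters are distinct.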
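The formulas can be checked by direct enumeration. The following is a small sketch with a made-up helper, `count_parts`, that collects each kind of part into a set and counts it. Because sets discard duplicates, the counts match the formulas only for strings whose characters are all distinct.

```python
from itertools import combinations


def count_parts(s):
    """Count distinct prefixes, suffixes, proper prefixes, substrings
    and subsequences of `s` by enumerating them."""
    n = len(s)
    prefixes = {s[:i] for i in range(n + 1)}
    suffixes = {s[i:] for i in range(n + 1)}
    # Proper prefixes exclude the empty string and the string itself.
    proper_prefixes = prefixes - {"", s}
    substrings = {s[i:j] for i in range(n + 1) for j in range(i, n + 1)}
    subsequences = {
        "".join(chosen)
        for k in range(n + 1)
        for chosen in combinations(s, k)
    }
    return {
        "prefixes": len(prefixes),
        "suffixes": len(suffixes),
        "proper_prefixes": len(proper_prefixes),
        "substrings": len(substrings),
        "subsequences": len(subsequences),
    }
```

For "abcd" (n = 4) this gives 5, 5, 3, 11 and 16, matching n+1, n+1, n-1, n(n+1)/2 + 1 and 2^n. For "aaa" there are only 4 distinct substrings and 4 distinct subsequences, which shows why the formulas assume distinct characters.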
Problem 3.3.4
Most languages are case sensitive, so keywords can be written only one way, and the regular expressions describing their lexeme is very simple. However, some languages, like SQL, are case insensitive, so a keyword can be written either in lowercase or in uppercase, or in any mixture of cases. Thus, the SQL keyword SELECT can also be written select, Select, or sElEcT, for instance. Show how to write a regular expression for a keyword in a case insensitive language. Illustrate the idea by writing the expression for “select” in SQL.
Answer:
select → [Ss][Ee][Ll][Ee][Cc][Tt]
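The same idea works for any keyword: replace each letter with a two-character class holding its upper- and lowercase forms. A minimal sketch follows; the function name `case_insensitive_regex` is invented for this example.

```python
import re


def case_insensitive_regex(keyword):
    """Build a pattern such as [Ss][Ee]... that matches `keyword`
    in any mixture of cases. Non-letters are escaped and kept as-is."""
    parts = []
    for ch in keyword:
        if ch.isalpha():
            parts.append(f"[{ch.upper()}{ch.lower()}]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = case_insensitive_regex("select")
```

Here `pattern` is `[Ss][Ee][Ll][Ee][Cc][Tt]`, which matches select, Select, SELECT and sElEcT. In practice most regex libraries also offer a case-insensitive flag (in Python, `re.IGNORECASE`), but the exercise asks for the expression itself.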
Problem 3.3.5
Write regular definitions for the following languages:
All strings of lowercase letters that contain the five vowels in order.
All strings of lowercase letters in which the letters are in ascending lexicographic order.
Comments, consisting of a string surrounded by /* and */, without an intervening */, unless it is inside double-quotes (").
All strings of digits with no repeated digits. Hint: Try this problem first with a few digits, such as {0, 1, 2}.
All strings of digits with at most one repeated digit.
All strings of a’s and b’s with an even number of a’s and an odd number of b’s.
The set of Chess moves, in the informal notation, such as p-k4 or kbp*qn.
All strings of a’s and b’s that do not contain the substring abb.
All strings of a’s and b’s that do not contain the subsequence abb.
Answer:
1.
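other → [bcdfghjklmnpqrstvwxyz]
res → (other)* a (other | a)* e (other | e)* i (other | i)* o (other | o)* u (other | u)*
Note that this assumes a cannot appear again once e has appeared, and likewise for the later vowels. This is the strict reading of "the five vowels in order".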
2.
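a* b* c* ⋯ z*
This one is a simple enumeration: each letter may repeat any number of times, and the letters must appear in alphabetical order.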
3.
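\/\*([^*"]*|".*"|\*+[^/])*\*\/
This one needs some explanation. [^*"]* matches a string of any length made of characters other than * and ". The alternative ".*" matches a double-quoted string, and \*+[^/] matches one or more *'s followed by a character other than /, which is meant to keep */ from appearing outside quotes. As written, the expression does not accept a comment such as /***/, where a run of *'s is immediately followed by the closing /.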
4.5.6.7.
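I have looked at the answers to these. They rely on state transition diagrams and some techniques for simplifying them, which I cannot follow yet. I will fill these in after reading the later chapters in a few days.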
8.
b*(a+b?)*
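Finally, one that is easy to understand. Any number of b's may appear at the start; once an a appears, the rest can only be built from runs of a's, each optionally followed by a single b, which is what the closure at the end expresses.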
9.
b* | b*a+ | b*a+ba*
Only a few forms of string satisfy this condition, namely the three above, so we can simply enumerate them to get the final result.
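Answers 2, 8 and 9 are easy to test mechanically. The sketch below is an illustration under stated assumptions, not a proof: it compares the expressions for 8 and 9 against direct substring and subsequence checks on every string over {a, b} up to a small length, and builds the expression for 2 programmatically. All helper names are made up for this example, and "ascending" is read as non-decreasing, as the expression a* b* ⋯ z* implies.

```python
import re
import string
from itertools import product


def all_strings(max_len, alphabet="ab"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def has_subsequence(s, target):
    """True if the characters of `target` occur in `s` in order,
    not necessarily adjacent."""
    it = iter(s)
    return all(ch in it for ch in target)


# Answer 2: a* b* c* ... z*, built from the alphabet.
ASCENDING = re.compile("".join(f"{c}*" for c in string.ascii_lowercase))
# Answer 8: no substring abb.
NO_SUBSTRING_ABB = re.compile(r"b*(a+b?)*")
# Answer 9: no subsequence abb.
NO_SUBSEQUENCE_ABB = re.compile(r"b*|b*a+|b*a+ba*")


def disagreements(max_len=8):
    """Return strings on which a regex and its direct check disagree."""
    bad = []
    for s in all_strings(max_len):
        if (NO_SUBSTRING_ABB.fullmatch(s) is not None) != ("abb" not in s):
            bad.append(("substring", s))
        if (NO_SUBSEQUENCE_ABB.fullmatch(s) is not None) != (
            not has_subsequence(s, "abb")
        ):
            bad.append(("subsequence", s))
    return bad
```

An empty result from `disagreements()` supports both answers on all strings up to length 8. The three alternatives in answer 9 can also be written more compactly as b*a*b?a*.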
Problem 3.3.6
Write character classes for the following sets of characters:
The first ten letters (up to “j”) in either upper or lower case.
The lowercase consonants.
The “digits” in a hexadecimal number (choose either upper or lower case for the “digits” above 9).
The characters that can appear at the end of a legitimate English sentence (e.g., exclamation point).
Answer:
1.
[A-Ja-j]
2.
[bcdfghjklmnpqrstvwxzy]
3.
[0-9a-f]
4.
[.?!]
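As a quick sanity check, each class can be expanded into the set of characters it matches and compared with the intended set. This is a sketch only; `matched_chars` and `CLASSES` are names invented here, and the expansion is restricted to printable ASCII.

```python
import re
import string


def matched_chars(char_class):
    """Return the set of printable ASCII characters matched by the class."""
    pattern = re.compile(char_class)
    return {c for c in string.printable if pattern.fullmatch(c)}


# Each entry pairs a character class with the set it should denote.
CLASSES = {
    "first ten letters": (r"[A-Ja-j]", set("ABCDEFGHIJabcdefghij")),
    "lowercase consonants": (
        r"[bcdfghjklmnpqrstvwxzy]",
        set(string.ascii_lowercase) - set("aeiou"),
    ),
    "hex digits": (r"[0-9a-f]", set("0123456789abcdef")),
    "sentence enders": (r"[.?!]", set(".?!")),
}

all_ok = all(
    matched_chars(char_class) == expected
    for char_class, expected in CLASSES.values()
)
```

Inside a character class, `.` and `?` lose their special meaning, which is why `[.?!]` needs no backslashes.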
Problem 3.3.7
Note that these regular expressions give all of the following symbols (operator characters) a special meaning:
\ ” . ^ $ [ ] * + ? { } | /
Their special meaning must be turned off if they are needed to represent themselves in a character string. We can do so by quoting the character within a string of length one or more; e.g., the regular expression "**" matches the string **. We can also get the literal meaning of an operator character by preceding it by a backslash. Thus, the regular expression \*\* also matches the string **. Write a regular expression that matches the string "\.
Answer:
\"\\
This one is simple: put a backslash in front of each character.
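The answer can be tried out in Python, with the caveat that Python's `re` syntax is not the same as Lex's (for instance, it has no "..." quoting). Backslash escaping works the same way for these two characters, so the sketch below is a reasonable check under that assumption. The variable names are made up for this example.

```python
import re

# The target is the two-character string: a double quote, then a backslash.
target = '"\\'
# The answer: each character preceded by a backslash (raw string, so the
# backslashes reach the regex engine unchanged).
answer = r'\"\\'

matches = re.fullmatch(answer, target) is not None

# re.escape builds an escaped pattern automatically; it should match too.
escaped_matches = re.fullmatch(re.escape(target), target) is not None
```

Both `matches` and `escaped_matches` come out true. Writing the pattern as a raw string keeps Python's own string escaping from being confused with the regex escaping.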
Problem 3.3.8-12
To be written.