
Part 27: Mastering Data Structures: Set and Map


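1. Applications of sets
  • Sets can be used to remove duplicates
  • Sets can be used to count distinct customers
  • Sets can be used to measure the vocabulary size of a text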
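
The de-duplication and vocabulary-count uses listed above can be sketched in a few lines. This is only an illustration: it uses the standard library's `java.util.TreeSet` as a stand-in for the `Set<E>` interface that this article builds by hand, and the class name `DedupDemo`, the method `distinctWords` and the sample words are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class DedupDemo {

    // Returns the distinct words of the input.
    // Adding a word that is already present leaves the set unchanged.
    static Set<String> distinctWords(List<String> words) {
        Set<String> unique = new TreeSet<>();
        for (String word : words) {
            unique.add(word);
        }
        return unique;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, chosen only for illustration
        List<String> words = Arrays.asList("pride", "and", "prejudice", "and", "pride");
        Set<String> unique = distinctWords(words);
        System.out.println("Total words: " + words.size());
        System.out.println("Total different words: " + unique.size());
    }
}
```

The size of the resulting set is the vocabulary size of the input, which is the same measurement the performance tests later in this article print as "Total different words".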
2. Implementing a set
  • Define the set interface
  • Set<E>
    ·void add(E)   // duplicate elements are not added
    ·void remove(E)
    ·boolean contains(E)
    ·int getSize()
    ·boolean isEmpty()

  • The set interface in code:

  • public interface Set<E> {
    
        void add(E e);
    
        void remove(E e);
    
        boolean contains(E e);
    
        int getSize();
    
        boolean isEmpty();
    }

  • Using a binary search tree (BST) as the underlying implementation of the set
  • public class BSTSet<E extends Comparable<E>> implements Set<E> {
    
        private BST<E> bst;
    
        // Constructor
        public BSTSet() {
            bst = new BST<>();
        }
    
        // Implements getSize
        @Override
        public int getSize() {
            return bst.size();
        }
    
        // Implements isEmpty
        @Override
        public boolean isEmpty() {
            return bst.isEmpty();
        }
    
        // Implements contains
        @Override
        public boolean contains(E e) {
            return bst.contains(e);
        }
    
        // Implements add
        @Override
        public void add(E e) {
            bst.add(e);
        }
    
        // Implements remove
        @Override
        public void remove(E e) {
            bst.remove(e);
        }
    }

  • Using a linked list as the underlying implementation of the set
  • public class LinkedListSet<E> implements Set<E> {
    
        private LinkedList<E> list;
    
        // Constructor
        public LinkedListSet() {
            list = new LinkedList<>();
        }
    
        // Implements getSize
        @Override
        public int getSize() {
            return list.getSize();
        }
    
        // Implements isEmpty
        @Override
        public boolean isEmpty() {
            return list.isEmpty();
        }
    
        // Implements contains
        @Override
        public boolean contains(E e) {
            return list.contains(e);
        }
    
        // Implements add; the contains check keeps duplicates out
        @Override
        public void add(E e) {
            if (!list.contains(e)) {
                list.addFirst(e);
            }
        }
    
        // Implements remove
        @Override
        public void remove(E e) {
            list.removeElement(e);
        }
    }

  • Performance comparison between the BST-backed set and the linked-list-backed set
  • import java.util.ArrayList;
    
    public class Main {
    
        public static double testSet(Set<String> set, String filename) {
    
            long startTime = System.nanoTime();
    
            System.out.println(filename);
            ArrayList<String> words = new ArrayList<>();
            if (FileOperation.readFile(filename, words)) {
                System.out.println("Total words: " + words.size());
    
                for (String word : words) {
                    set.add(word);
                }
                System.out.println("Total different words: " + set.getSize());
            }
    
            long endTime = System.nanoTime();
    
            return (endTime - startTime) / 1000000000.0;
        }
    
        public static void main(String[] args) {
            String filename = "pride-and-prejudice.txt";
    
            BSTSet<String> bstSet = new BSTSet<>();
            double time1 = testSet(bstSet, filename);
            System.out.println("BSTSet, time: " + time1 + " s");
    
            System.out.println();
    
            LinkedListSet<String> linkedListSet = new LinkedListSet<>();
            double time2 = testSet(linkedListSet, filename);
            System.out.println("LinkedListSet, time: " + time2 + " s");
        }
    }

  • Output:
  • pride-and-prejudice.txt
    Total words: 125901
    Total different words: 6530
    BSTSet, time: 0.109504342 s
    
    pride-and-prejudice.txt
    Total words: 125901
    Total different words: 6530
    LinkedListSet, time: 2.208894105 s

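  • Comparing the results, the BST-backed set is far more efficient than the linked-list-backed set (about 0.11 s versus 2.21 s on this input)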

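3. Time complexity of the set implementations

  • [Figure: time complexity of the set operations for LinkedListSet and BSTSet]

  • Here "h" denotes the height of the binary search tree. Each BSTSet operation (add, contains, remove) follows a single path down from the root and so costs O(h), while each LinkedListSet operation may scan the whole list and costs O(n)
  • When the BST is "full", performance is at its best and each operation takes O(log n); when the BST degenerates into a linked list, performance is at its worst and each operation takes O(n)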
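
The gap between the best and worst case can be made concrete by measuring the height of a tree built from the same keys in two different insertion orders. The sketch below is self-contained and is not the `BST<E>` class used by `BSTSet`; the names `BstHeightDemo`, `heightSorted` and `heightBalanced` are assumptions introduced for the example, and height is counted in nodes along the longest path from the root.

```java
class BstHeightDemo {

    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Inserts key into the subtree rooted at node and returns the subtree's root.
    // Duplicate keys are ignored.
    static Node add(Node node, int key) {
        if (node == null) return new Node(key);
        if (key < node.key) node.left = add(node.left, key);
        else if (key > node.key) node.right = add(node.right, key);
        return node;
    }

    // Number of nodes on the longest root-to-leaf path; an empty tree has height 0.
    static int height(Node node) {
        return node == null ? 0 : 1 + Math.max(height(node.left), height(node.right));
    }

    // Inserts 1..n in ascending order: every node gets only a right child,
    // so the tree degenerates into a linked list.
    static int heightSorted(int n) {
        Node root = null;
        for (int k = 1; k <= n; k++) root = add(root, k);
        return height(root);
    }

    // Inserts the middle key of each range before the keys on either side of it,
    // which yields a full tree when n = 2^h - 1.
    static int heightBalanced(int n) {
        return height(addMiddleFirst(null, 1, n));
    }

    private static Node addMiddleFirst(Node root, int lo, int hi) {
        if (lo > hi) return root;
        int mid = lo + (hi - lo) / 2;
        root = add(root, mid);
        root = addMiddleFirst(root, lo, mid - 1);
        root = addMiddleFirst(root, mid + 1, hi);
        return root;
    }

    public static void main(String[] args) {
        int n = 1023; // 2^10 - 1 keys
        System.out.println("Height after sorted inserts:       " + heightSorted(n));   // 1023
        System.out.println("Height after middle-first inserts: " + heightBalanced(n)); // 10
    }
}
```

With 1023 keys, the sorted order produces a tree of height 1023 (h = n), while the middle-first order produces a full tree of height 10 (h = log2(n + 1)). Every O(h) operation inherits that difference.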

4. Map
  • A map is a data structure that stores (key, value) pairs
  • Given a key, it looks up the associated value
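
As a quick illustration of looking up a value by its key, the sketch below counts word occurrences. It relies on the standard library's `java.util.TreeMap` rather than the `Map<K, V>` interface this article defines, so the method names (`containsKey`, `put`, `get`) are the standard library's; the class name `WordCountDemo`, the method `countWords` and the sample words are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class WordCountDemo {

    // Maps each word (the key) to the number of times it occurs (the value).
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            if (counts.containsKey(word)) {
                counts.put(word, counts.get(word) + 1); // key exists: update its value
            } else {
                counts.put(word, 1);                    // first occurrence: add a new pair
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, chosen only for illustration
        List<String> words = Arrays.asList("pride", "and", "prejudice", "and", "pride");
        Map<String, Integer> counts = countWords(words);
        System.out.println("Frequency of pride: " + counts.get("pride"));
        System.out.println("Frequency of vanity: " + counts.get("vanity")); // absent key
    }
}
```

Looking up a key that was never added returns `null` here, which is why the lookup result should be checked before it is used in arithmetic.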
5. Implementing a map
  • Define the map interface
  • Map<K, V>
    ·void add(K, V)
    ·V remove(K)
    ·boolean contains(K)
    ·V get(K)
    ·void set(K, V)
    ·int getSize()
    ·boolean isEmpty()

  • The map interface in code:
  • public interface Map<K, V> {
    
        void add(K key, V value);
    
        V remove(K key);
    
        boolean contains(K key);
    
        V get(K key);
    
        void set(K key, V value);
    
        int getSize();
    
        boolean isEmpty();
    }

  • Using a linked list as the underlying implementation of the map
  • public class LinkedListMap<K, V> implements Map<K, V> {
    
        private class Node {
            public K key;
            public V value;
            public Node next;
    
            public Node(K key, V value, Node next) {
                this.key = key;
                this.value = value;
                this.next = next;
            }
    
            public Node(K key) {
                this(key, null, null);
            }
    
            public Node() {
                this(null, null, null);
            }
    
            @Override
            public String toString() {
                return key.toString() + " : " + value.toString();
            }
        }
    
        private Node dummyHead;
        private int size;
    
        // Constructor
        public LinkedListMap() {
            dummyHead = new Node();
            size = 0;
        }
    
        // Implements getSize
        @Override
        public int getSize() {
            return size;
        }
    
        // Implements isEmpty
        @Override
        public boolean isEmpty() {
            return size == 0;
        }
    
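        // Returns the node holding key, or null if the key is absent
        private Node getNode(K key) {
            // Start after the dummy head: its key is null, so calling equals on it would throw
            Node cur = dummyHead.next;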
            while (cur != null) {
                if (cur.key.equals(key)) {
                    return cur;
                }
                cur = cur.next;
            }
            return null;
        }
    
        // Implements contains
        @Override
        public boolean contains(K key) {
            return getNode(key) != null;
        }
    
        // Implements get
        @Override
        public V get(K key) {
            Node node = getNode(key);
    
            // return node == null ? null : node.value;
            if (node != null) {
                return node.value;
            }
            return null;
        }
    
        // Implements add; an existing key has its value overwritten
        @Override
        public void add(K key, V value) {
            Node node = getNode(key);
            if (node == null) {
                dummyHead.next = new Node(key, value, dummyHead.next);
                size++;
            } else {
                node.value = value;
            }
        }
    
        // Implements set
        @Override
        public void set(K key, V newValue) {
            Node node = getNode(key);
            if (node == null) {
                throw new IllegalArgumentException(key + " doesn't exist.");
            } else {
                node.value = newValue;
            }
        }
    
        // Implements remove
        @Override
        public V remove(K key) {
    
            Node node = getNode(key);
            if (node == null) {
                throw new IllegalArgumentException(key + " doesn't exist.");
            }
    
            Node prev = dummyHead;
            while (prev.next != null) {
                if (prev.next.key.equals(key)) {
                    break;
                }
                prev = prev.next;
            }
    
            if (prev.next != null) {
                Node delNode = prev.next;
                prev.next = delNode.next;
                delNode.next = null;
                size--;
                return delNode.value;
            }
            return null;
        }
    }

  • Using a binary search tree as the underlying implementation of the map
  • public class BSTMap<K extends Comparable<K>, V> implements Map<K, V> {
    
        private class Node {
            private K key;
            private V value;
            private Node left;
            private Node right;
    
            // Constructor
            public Node(K key, V value) {
                this.key = key;
                this.value = value;
                this.left = null;
                this.right = null;
            }
    
    //        public Node(K key) {
    //            this(key, null);
    //        }
        }
    
        private Node root;
        private int size;
    
        // Constructor
        public BSTMap() {
            root = null;
            size = 0;
        }
    
        // Implements getSize
        @Override
        public int getSize() {
            return size;
        }
    
        // Implements isEmpty
        @Override
        public boolean isEmpty() {
            return size == 0;
        }
    
        // Implements add
        @Override
        public void add(K key, V value) {
            root = add(root, key, value);
        }
    
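        // Recursively inserts (key, value) into the BST rooted at node;
        // if the key already exists, its value is overwritten.
        // Returns the root of the BST after the insertion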
        private Node add(Node node, K key, V value) {
    
            if (node == null) {
                size++;
                return new Node(key, value);
            }
    
            if (key.compareTo(node.key) < 0) {
                node.left = add(node.left, key, value);
            } else if (key.compareTo(node.key) > 0) {
                node.right = add(node.right, key, value);
            } else {
                node.value = value;
            }
            return node;
        }
    
        // Returns the node holding key in the BST rooted at node, or null if the key is absent
        private Node getNode(Node node, K key) {
    
            if (node == null)
                return null;
    
            if (key.compareTo(node.key) < 0) {
                return getNode(node.left, key);
            } else if (key.compareTo(node.key) > 0) {
                return getNode(node.right, key);
            } else {
                return node;
            }
        }
    
        @Override
        public boolean contains(K key) {
            return getNode(root, key) != null;
        }
    
        @Override
        public V get(K key) {
    
            Node node = getNode(root, key);
            return node == null ? null : node.value;
        }
    
        @Override
        public void set(K key, V newValue) {
            Node node = getNode(root, key);
            if (node == null)
                throw new IllegalArgumentException(key + " doesn't exist!");
    
            node.value = newValue;
        }
    
        // Returns the node holding the minimum key in the BST rooted at node
        private Node minimum(Node node) {
            if (node.left == null) {
                return node;
            }
            return minimum(node.left);
        }
    
        // Removes the node holding the minimum key from the BST rooted at node.
        // Returns the root of the BST after the removal
        private Node removeMin(Node node) {
            if (node.left == null) {
                Node rightNode = node.right;
                node.right = null;
                size--;
                return rightNode;
            }
            node.left = removeMin(node.left);
            return node;
        }
    
        // Implements remove.
        // Removes the node whose key is key and returns its value, or null if the key is absent
        @Override
        public V remove(K key) {
            Node node = getNode(root, key);
    
            if (node != null) {
                root = remove(root, key);
                return node.value;
            }
            return null;
        }
    
        // Recursively removes the node whose key is key from the BST rooted at node.
        // Returns the root of the BST after the removal
        private Node remove(Node node, K key) {
            if (node == null) {
                return null;
            }
    
            if (key.compareTo(node.key) < 0) {
                node.left = remove(node.left, key);
                return node;
            } else if (key.compareTo(node.key) > 0) {
                node.right = remove(node.right, key);
                return node;
            } else {
                if (node.left == null) {
                    // Case 1: the node to delete has no left subtree
                    Node rightNode = node.right;
                    node.right = null;
                    size--;
                    return rightNode;
                } else if (node.right == null) {
                    // Case 2: the node to delete has no right subtree
                    Node leftNode = node.left;
                    node.left = null;
                    size--;
                    return leftNode;
                } else {
                    // Case 3: both subtrees are non-empty.
                    // Find the smallest node larger than the one being deleted,
                    // i.e. the minimum of its right subtree, and put it in its place
                    Node successor = minimum(node.right);
                    successor.right = removeMin(node.right);  // removeMin already performs size--
                    successor.left = node.left;
                    node.left = null;
                    node.right = null;
                    return successor;
                }
            }
        }
    }

  • Performance comparison between the BST-backed map and the linked-list-backed map
  • import java.util.ArrayList;
    
    public class Main {
    
        public static double testMap(Map<String, Integer> map, String filename) {
    
            long startTime = System.nanoTime();
    
            System.out.println(filename);
            ArrayList<String> words = new ArrayList<>();
            if (FileOperation.readFile(filename, words)) {
                System.out.println("Total words: " + words.size());
                for (String word : words) {
                    if (map.contains(word)) {
                        map.set(word, map.get(word) + 1);
                    } else {
                        map.add(word, 1);
                    }
                }
    
                System.out.println("Total different words: " + map.getSize());
                System.out.println("Frequency of PRIDE: " + map.get("pride"));
                System.out.println("Frequency of PREJUDICE: " + map.get("prejudice"));
            }
    
            long endTime = System.nanoTime();
    
            return (endTime - startTime) / 1000000000.0;
        }
    
        public static void main(String[] args) {
    
            String filename = "pride-and-prejudice.txt";
    
            LinkedListMap<String, Integer> linkedListMap = new LinkedListMap<>();
            double time1 = testMap(linkedListMap, filename);
            System.out.println("Linked List Map, time: " + time1 + " s");
    
            System.out.println();
            System.out.println();
    
            BSTMap<String, Integer> bstMap = new BSTMap<>();
            double time2 = testMap(bstMap, filename);
            System.out.println("BST Map, time: " + time2 + " s");
    
        }
    }

  • Output:
  • pride-and-prejudice.txt
    Total words: 125901
    Total different words: 6530
    Frequency of PRIDE: 53
    Frequency of PREJUDICE: 11
    Linked List Map, time: 9.692566895 s
    
    
    pride-and-prejudice.txt
    Total words: 125901
    Total different words: 6530
    Frequency of PRIDE: 53
    Frequency of PREJUDICE: 11
    BST Map, time: 0.085364242 s

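  • Comparing the results, the BST-backed map is far more efficient than the linked-list-backed map (about 0.085 s versus 9.69 s on this input)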

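6. Time complexity of the map implementations

  • [Figure: time complexity of the map operations for LinkedListMap and BSTMap]

  • Here "h" denotes the height of the binary search tree
  • When the BST is "full", performance is at its best and each operation takes O(log n); when the BST degenerates into a linked list, performance is at its worst and each operation takes O(n)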
