
SVD and LSI Tutorial (3): Computing the Full SVD of a Matrix

Dr. E. Garcia

Mi Islita.com

Email | Last Update: 01/07/07

Revisiting Singular Values

In Part 2 of this tutorial you learned that SVD decomposes a matrix A into the product of three matrices

Equation 1: A = USV^T

S was computed by the following procedure:

  1. A^T and A^T A were computed.
  2. the eigenvalues of A^T A were determined and sorted in descending order, in the absolute sense. The nonnegative square roots of these are the singular values of A.
  3. S was constructed by placing the singular values in descending order along its diagonal.
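Since the worked matrix from Part 2 survives only in the figures, these three steps can be sketched in plain Python for a hypothetical stand-in matrix, A = [[4, 0], [3, -5]], chosen only because its A^T A has the same eigenvalues (40 and 10) as the tutorial's example:

```python
# Hypothetical stand-in matrix: its A^T A = [[25, -15], [-15, 25]] has
# eigenvalues 40 and 10, matching the tutorial's example.
import math

A = [[4, 0], [3, -5]]

# Step 1: compute A^T and A^T A.
At = [[A[j][i] for j in range(2)] for i in range(2)]
AtA = [[sum(At[i][k] * A[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]

# Step 2: for a 2x2 matrix the characteristic equation is
# c^2 - trace*c + det = 0; solve it and sort the roots in descending order.
trace = AtA[0][0] + AtA[1][1]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
disc = math.sqrt(trace ** 2 - 4 * det)
eigenvalues = sorted([(trace + disc) / 2, (trace - disc) / 2], reverse=True)

# The nonnegative square roots of the eigenvalues are the singular values.
singular_values = [math.sqrt(c) for c in eigenvalues]

# Step 3: place the singular values in descending order along the diagonal of S.
S = [[singular_values[0], 0], [0, singular_values[1]]]
print(eigenvalues)  # [40.0, 10.0]
```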

You learned that the Rank of a Matrix is the number of nonzero singular values.

You also learned that since S is a diagonal matrix, its nondiagonal elements are equal to zero. This can be verified by computing S from U^T A V. However, one would need to know U first, which we have not defined yet. Either way, the alternate expression for S is obtained by postmultiplying Equation 1 by V and premultiplying by U^T:

Equation 2: AV = USV^T V = US

Equation 3: U^T AV = S

This step relies on U and V being orthogonal matrices. As discussed in Matrix Tutorial 2: Basic Matrix Operations, if a matrix M is orthogonal then

Equation 4: MM^T = M^T M = I

where I is the identity matrix. But we know that MM^(-1) = I. Consequently, M^T = M^(-1).
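A quick numeric sanity check of Equation 4, using a 2 x 2 rotation matrix as an example (rotation matrices are orthogonal by construction; the angle pi/6 is arbitrary):

```python
# Numeric check that a rotation matrix M satisfies M M^T = M^T M = I,
# hence M^T = M^(-1). Any angle t works; t = pi/6 is arbitrary.
import math

t = math.pi / 6
M = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
Mt = [[M[j][i] for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

MMt = matmul(M, Mt)
MtM = matmul(Mt, M)
for P in (MMt, MtM):
    for i in range(2):
        for j in range(2):
            assert abs(P[i][j] - (1.0 if i == j else 0.0)) < 1e-12
```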

Computing "right" eigenvectors, V, and V^T

In the example given in Part 2 you learned that the eigenvalues of AA^T and A^T A are identical, since both satisfy the same characteristic equation; e.g.

Figure 1. Characteristic equation and eigenvalues for AA^T and A^T A.


Let's use these eigenvalues to compute the eigenvectors of A^T A. This is done by solving

Equation 5: (A^T A - c_i I)X_i = 0

As mentioned in Matrix Tutorial 3: Eigenvalues and Eigenvectors, for large matrices one would need to resort to the Power Method or other numerical methods to do this. Fortunately, in this case we are dealing with a small matrix, so simple algebra suffices.

We first compute an eigenvector for each eigenvalue, c1 = 40 and c2 = 10. Once computed, we convert the eigenvectors to unit vectors by normalizing their lengths. Figure 2 illustrates these steps.

Figure 2. "Right" eigenvectors of A^T A.
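The algebra in Figure 2 can be sketched as follows for a stand-in A^T A = [[25, -15], [-15, 25]], a hypothetical matrix chosen only because its eigenvalues are c1 = 40 and c2 = 10: fix one coordinate, solve the first row of (A^T A - cI)x = 0 for the other, then normalize.

```python
# Solving (A^T A - c I)x = 0 for a stand-in A^T A = [[25, -15], [-15, 25]]
# (eigenvalues c1 = 40, c2 = 10): fix x1 = 1, read x2 off the first row,
# then normalize to unit length.
import math

AtA = [[25, -15], [-15, 25]]

def unit_eigenvector(M, c):
    # First row of (M - c I) gives (m11 - c) * x1 + m12 * x2 = 0; set x1 = 1.
    x1 = 1.0
    x2 = -(M[0][0] - c) * x1 / M[0][1]
    length = math.sqrt(x1 ** 2 + x2 ** 2)
    return [x1 / length, x2 / length]

v1 = unit_eigenvector(AtA, 40)  # [0.7071..., -0.7071...]
v2 = unit_eigenvector(AtA, 10)  # [0.7071...,  0.7071...]
```

Choosing a different arbitrary value for x1 only rescales the vector; normalization gives the same unit eigenvector either way, which is the point made in the text below.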


We would have arrived at identical results if during normalization we had assumed an arbitrary coordinate value for either x1 or x2. We now construct V by placing the eigenvectors along its columns, and compute V^T.

Figure 3. V and its transpose V^T.


Hey! That wasn't that hard.

Note that we constructed V by preserving the order in which the singular values were placed along the diagonal of S. That is, we placed the eigenvector corresponding to the largest eigenvalue in the first column and the second eigenvector in the second column. These end up paired with the singular values placed along the diagonal of S. Preserving the order in which singular values, eigenvalues and eigenvectors are placed in their corresponding matrices is very important. Otherwise we end up with the wrong SVD.

Let's compute now the "left" eigenvectors and U.

Computing "left" eigenvectors and U

To compute U we could reuse the eigenvalues and compute, in exactly the same manner, the eigenvectors of AA^T, then place these along the columns of U. However, with large matrices this is time consuming: one would again need to resort to the Power Method or other suitable methods to compute the eigenvectors.

In practice, it is common to use the following shortcut. Postmultiply Equation 2 by S^(-1) to obtain

Equation 6: AVS^(-1) = USS^(-1)

Equation 7: U = AVS^(-1)

and then compute U. Since A and V are already known, we just need to invert S. Since S is a diagonal matrix, it follows that

Figure 4. The inverted singular value matrix S^(-1).


Since s1 = 40^(1/2) = 6.3245 and s2 = 10^(1/2) = 3.1622 (expressed to four decimal places), then

Figure 5. "Left" eigenvectors and U.
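The shortcut in Equation 7 can be sketched for a hypothetical stand-in matrix A = [[4, 0], [3, -5]] (chosen only because its A^T A has eigenvalues 40 and 10, so its singular values and eigenvectors mirror the tutorial's numbers): invert the diagonal S by taking reciprocals, then form U = AVS^(-1).

```python
# Stand-in example: A = [[4, 0], [3, -5]] has singular values sqrt(40) and
# sqrt(10); the unit eigenvectors [1, -1]/sqrt(2) and [1, 1]/sqrt(2) of its
# A^T A are the columns of V. Form U = A V S^(-1) as in Equation 7.
import math

A = [[4, 0], [3, -5]]
s1, s2 = math.sqrt(40), math.sqrt(10)
r = 1 / math.sqrt(2)
V = [[r, r], [-r, r]]
S_inv = [[1 / s1, 0], [0, 1 / s2]]  # inverse of a diagonal matrix: reciprocals

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

U = matmul(matmul(A, V), S_inv)
print(U)  # [[0.4472..., 0.8944...], [0.8944..., -0.4472...]]
```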


That was quite a mechanical task. Huh?

This shortcut is very popular since it simplifies calculations. Unfortunately, its widespread use has resulted in many overlooking important information contained in the AA^T matrix. In recent years, LSI researchers have found that the high-order term-term co-occurrence patterns contained in this matrix might be important. At least two studies (1, 2), one a 2005 thesis, indicate that high-order term-term co-occurrence present in this matrix might be at the heart of LSI.

These studies are:

In the first issue of our IR Watch - The Newsletter (which is free) our subscribers learned about this thesis and other equally interesting LSI resources.

The orthogonal nature of the V and U matrices is evident from their eigenvectors. It can be demonstrated by computing dot products between column vectors: all dot products are equal to zero. A visual inspection is also possible in this case. In Figure 6 we have plotted the eigenvectors. Observe that they are all orthogonal and lie to the right and left of each other; hence the reference to these as "right" and "left" eigenvectors.

Figure 6. "Right" and "Left" eigenvectors.
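The dot-product check is a one-liner per pair of columns. Here it is for a hypothetical stand-in pair of matrices with the same structure as V and U above (columns [1, -1]/sqrt(2), [1, 1]/sqrt(2) and [1, 2]/sqrt(5), [2, -1]/sqrt(5)):

```python
# Dot products between the column vectors of stand-in V and U matrices
# are zero, confirming that their columns are mutually orthogonal.
import math

r = 1 / math.sqrt(2)
s = 1 / math.sqrt(5)
V = [[r, r], [-r, r]]
U = [[s, 2 * s], [2 * s, -s]]

for M in (V, U):
    dot = M[0][0] * M[0][1] + M[1][0] * M[1][1]  # column 0 . column 1
    assert abs(dot) < 1e-12
```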


Computing the Full SVD

So, we finally know U, S, V and V^T. To complete the proof, we reconstruct A by computing its full SVD.

Figure 7. Computing the full SVD.


So as we can see, SVD is a straightforward matrix decomposition and reconstruction technique.
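The reconstruction can be verified numerically. For a hypothetical stand-in matrix A = [[4, 0], [3, -5]] (whose A^T A has the tutorial's eigenvalues 40 and 10), multiplying U, S and V^T recovers A:

```python
# Full SVD reconstruction for a stand-in example: U S V^T recovers
# A = [[4, 0], [3, -5]] (U and V written in exact radical form).
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r = 1 / math.sqrt(2)
s = 1 / math.sqrt(5)
U = [[s, 2 * s], [2 * s, -s]]
S = [[math.sqrt(40), 0], [0, math.sqrt(10)]]
Vt = [[r, -r], [r, r]]  # transpose of V = [[r, r], [-r, r]]

A_rebuilt = matmul(matmul(U, S), Vt)  # ≈ [[4, 0], [3, -5]]
```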

The Reduced SVD

Obtaining an approximation of the original matrix is quite easy. This is done by truncating the three matrices obtained from the full SVD. Essentially, we keep the first k columns of U, the first k rows of V^T, and the first k rows and columns of S; that is, the first k singular values. This removes noisy dimensions and exposes the effect of the largest k singular values on the original data. This effect is hidden, masked, latent, in the full SVD.

The reduction process is illustrated in Figure 8 and is often referred to as "computing the reduced SVD", dimensionality reduction or the Rank k Approximation.

Figure 8. The Reduced SVD or Rank k Approximation.


The shaded areas indicate the parts of the matrices that are retained. The approximated matrix Ak is the Rank k Approximation of the original matrix and is defined as

Equation 8: Ak = Uk Sk Vk^T

So, once these matrices are approximated we simply compute their products to get Ak.
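For the hypothetical stand-in matrix A = [[4, 0], [3, -5]] (same eigenvalues, 40 and 10, as the tutorial's example), the Rank 1 Approximation keeps only the first column of U, the largest singular value, and the first row of V^T:

```python
# Rank 1 approximation of the stand-in example: keep only the first column
# of U, the largest singular value s1, and the first row of V^T.
import math

s = 1 / math.sqrt(5)
r = 1 / math.sqrt(2)
u1 = [s, 2 * s]      # first column of U
s1 = math.sqrt(40)   # largest singular value
v1t = [r, -r]        # first row of V^T

# A1 = u1 * s1 * v1t is an outer product: the rank 1 approximation of A.
A1 = [[u1[i] * s1 * v1t[j] for j in range(2)] for i in range(2)]
print(A1)  # [[2.0, -2.0], [4.0, -4.0]] up to rounding
```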

Quite easy. Huh?

Summary

So far we have learned that the full SVD of a matrix A can be computed by the following procedure:

  1. compute its transpose A^T and the product A^T A.
  2. determine the eigenvalues of A^T A and sort these in descending order, in the absolute sense. Take the nonnegative square roots of these to obtain the singular values of A.
  3. construct the diagonal matrix S by placing the singular values in descending order along its diagonal, and compute its inverse, S^(-1).
  4. use the ordered eigenvalues from step 2 to compute the eigenvectors of A^T A. Place these eigenvectors along the columns of V and compute its transpose, V^T.
  5. compute U as U = AVS^(-1). To complete the proof, compute the full SVD using A = USV^T.

Before concluding, let me mention this: in this tutorial you have learned how SVD is applied to a matrix where m = n. This is just one possible scenario. In general, if

  1. m = n and all singular values are greater than zero, the pseudoinverse of A coincides with its inverse, A^(-1) = VS^(-1)U^T.
  2. m < n, S is m x n with its last columns all zero, and the SVD gives the solution with minimum norm.
  3. m > n, S is n x n, there are more equations than unknowns, and the SVD gives the least-squares solution.

Movellan discusses these cases in great detail.
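Case 1 can be checked numerically with the hypothetical stand-in square matrix A = [[4, 0], [3, -5]] used for illustration throughout: since all of its singular values are nonzero, VS^(-1)U^T reproduces A^(-1).

```python
# Checking case 1 with the stand-in square matrix A = [[4, 0], [3, -5]]:
# since all singular values are nonzero, V S^(-1) U^T equals A^(-1).
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

r = 1 / math.sqrt(2)
s = 1 / math.sqrt(5)
A = [[4, 0], [3, -5]]
V = [[r, r], [-r, r]]
S_inv = [[1 / math.sqrt(40), 0], [0, 1 / math.sqrt(10)]]
Ut = [[s, 2 * s], [2 * s, -s]]  # U happens to be symmetric for this example

A_inv = matmul(matmul(V, S_inv), Ut)
identity = matmul(A, A_inv)  # ≈ [[1, 0], [0, 1]]
```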

So, how is SVD used in Latent Semantic Indexing (LSI)?

In LSI, the intent is not to reconstruct A. The goal is to find the best rank k approximation of A, one that improves retrieval. The selection of k and of the number of singular values in S to use is still an open area of research. During her tenure at Bellcore (now Telcordia), Microsoft's Susan Dumais mentioned in a 1995 presentation (Transcription of the Application) that her research group experimented with k values largely "by seat of the pants".

Early studies with the MED database, using a few hundred documents and dozens of queries, indicate that performance versus k is not entirely proportional, but tends to describe an inverted U-shaped curve peaking at around k = 100. These results might change under other experimental conditions. At the time of writing, optimum k values are still determined via trial-and-error experimentation.

Now that we have the basic calculations out of the way, let's move forward and learn how LSI scores documents and queries. It is time to demystify these calculations. Wait for Part 4 and see.

This is getting exciting.

Tutorial Review

For the matrix

Rank 2 Example
  1. Compute the eigenvalues of A^T A.
  2. Prove that this is a matrix of Rank 2.
  3. Compute its full SVD.
  4. Compute its Rank 2 Approximation.

