Running Kettle Jobs and Transformations from the Command Line on Linux with Kitchen and Pan
阿新 • Published: 2019-01-22
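1. Preparation
One simple job and one simple transformation (trans).
For convenience, and so that the effect is easy to see, both the job and the trans write a file.
trans: reads the names of all files in the download directory and writes them to a file. [Tested successfully in the GUI]
The target file was generated successfully:
job: creates a file. [Tested successfully in GUI mode]
Run result:
Delete the result files produced by the GUI test runs so that they do not interfere with observing the command-line runs.
2. Running the job and trans from the command line on Linux
Pan is the PDI command-line tool for running transformations.
Kitchen is the PDI command-line tool for running jobs.
a. Pan command-line options and syntax
Syntax:
pan.sh -option=value arg1 arg2
Command-line options: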
Switch | Purpose |
---|---|
rep | Enterprise or database repository name, if you are using one |
user | Repository username |
pass | Repository password |
trans | The name of the transformation (as it appears in the repository) to launch |
dir | The repository directory that contains the transformation, including the leading slash |
file | If you are calling a local KTR file, this is the filename, including the path if it is not in the local directory |
level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |
logfile | A local filename to write log output to |
listdir | Lists the directories in the specified repository |
listtrans | Lists the transformations in the specified repository directory |
listrep | Lists the available repositories |
exprep | Exports all repository objects to one XML file |
norep | Prevents Pan from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Pan from logging into the specified repository, assuming you would like to execute a local KTR file instead. |
safemode | Runs in safe mode, which enables extra checking |
version | Shows the version, revision, and build date |
param | Set a named parameter in a name=value format. For example: -param:FOO=bar |
listparam | List information about the defined named parameters in the specified transformation. |
maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default) |
maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default) |
Example:
sh pan.sh -rep=initech_pdi_repo -user=pgibbons -pass=lumburghsux -trans=TPS_reports_2011
Example of calling a local trans:
./pan.sh -file=/home/hadoop/workplace/kettle/trans/test_cml.ktr -norep
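When a local transformation is run from a script, it helps to check the exit status of pan.sh. The sketch below wraps the call above and translates the status into a message. The install path in `PDI_HOME` and the helper names `pan_status_text` and `run_trans` are assumptions made for illustration. The status meanings follow the Pan documentation as I understand it, so check them against your PDI version.

```shell
# Assumed install location; override PDI_HOME to match your environment.
PDI_HOME="${PDI_HOME:-/home/kettle/data-integration}"

# Translate a Pan exit status into a short description (hypothetical helper).
pan_status_text() {
    case "$1" in
        0) echo "success" ;;
        1) echo "errors occurred during processing" ;;
        2) echo "unexpected error during loading or running" ;;
        3) echo "unable to prepare and initialize the transformation" ;;
        7) echo "transformation could not be loaded from XML or the repository" ;;
        8) echo "error loading steps or plugins" ;;
        9) echo "command line usage printed" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Run a local KTR file without a repository and report the outcome.
run_trans() {
    ktr="$1"
    "$PDI_HOME/pan.sh" -file="$ktr" -norep -level=Basic
    status=$?
    echo "pan.sh exited with $status: $(pan_status_text "$status")"
    return "$status"
}

# Usage:
#   run_trans /home/hadoop/workplace/kettle/trans/test_cml.ktr
```

Returning the original status from `run_trans` lets a calling script or scheduler still react to a failed run.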
b. Kitchen command-line options and syntax:
The syntax is the same as Pan's; the options differ slightly.
Switch | Purpose |
---|---|
rep | Enterprise or database repository name, if you are using one |
user | Repository username |
pass | Repository password |
job | The name of the job (as it appears in the repository) to launch |
dir | The repository directory that contains the job, including the leading slash |
file | If you are calling a local KJB file, this is the filename, including the path if it is not in the local directory |
level | The logging level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |
logfile | A local filename to write log output to |
listdir | Lists the sub-directories within the specified repository directory |
listjob | Lists the jobs in the specified repository directory |
listrep | Lists the available repositories |
export | Exports all linked resources of the specified job. The argument is the name of a ZIP file. |
norep | Prevents Kitchen from logging into a repository. If you have set the KETTLE_REPOSITORY, KETTLE_USER, and KETTLE_PASSWORD environment variables, then this option will enable you to prevent Kitchen from logging into the specified repository, assuming you would like to execute a local KJB file instead. |
version | Shows the version, revision, and build date |
param | Set a named parameter in a name=value format. For example: -param:FOO=bar |
listparam | List information about the defined named parameters in the specified job. |
maxloglines | The maximum number of log lines that are kept internally by PDI. Set to 0 to keep all rows (default) |
maxlogtimeout | The maximum age (in minutes) of a log line while being kept internally by PDI. Set to 0 to keep all rows indefinitely (default) |
Command line for running a local job:
/home/kettle/data-integration/kitchen.sh -file=/home/kettle/transition/move.kjb -logfile=log.log
General form:
$kitchen_path -file=$job_path -logfile=$log_path
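The general form above can be wrapped in a small function, which is handy when the same job is launched from cron or another script. This is a minimal sketch: the function name `run_job` and the `DRY_RUN` switch are made up for illustration, and the default `KITCHEN` path is simply the one used in this article. It uses the `-logfile` switch listed in the Kitchen table.

```shell
# Assumed launcher path; override KITCHEN to match your environment.
KITCHEN="${KITCHEN:-/home/kettle/data-integration/kitchen.sh}"

# run_job JOB_FILE LOG_FILE
# Runs a local KJB file without a repository and writes the log to LOG_FILE.
# With DRY_RUN set, only prints the command that would be executed.
run_job() {
    job="$1"
    log="$2"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$KITCHEN -file=$job -norep -logfile=$log"
        return 0
    fi
    "$KITCHEN" -file="$job" -norep -logfile="$log"
}

# Usage:
#   run_job /home/kettle/transition/move.kjb log.log
#   DRY_RUN=1; run_job /home/kettle/transition/move.kjb log.log
```

The dry-run branch makes it possible to inspect the exact command line before scheduling the job.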
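Result of calling pan:
Result of calling kitchen:
3. Command options I use most often
In my current work environment I only run local job and trans files, so the options I use most often are:
Option | Description |
---|---|
-file | Path to the job or trans file |
-norep | Do not log in to a repository; the file is a local one |
-param | Sets a named parameter |
-logfile | Log output filename |
-level | Log level (Basic, Detailed, Debug, Rowlevel, Error, Nothing) |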
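The options in this table can be combined into one helper that works for both tools. The sketch below picks pan.sh for .ktr files and kitchen.sh for .kjb files, then passes each `NAME=value` argument as `-param:NAME=value`. The names `pdi_tool_for`, `run_local`, and `DRY_RUN`, and the default `PDI_HOME` path, are assumptions for illustration and not part of PDI.

```shell
# Assumed install location; override PDI_HOME to match your environment.
PDI_HOME="${PDI_HOME:-/home/kettle/data-integration}"

# Pick the launcher from the file extension: .ktr -> pan.sh, .kjb -> kitchen.sh.
pdi_tool_for() {
    case "$1" in
        *.ktr) echo "pan.sh" ;;
        *.kjb) echo "kitchen.sh" ;;
        *) echo "unsupported file: $1" >&2; return 1 ;;
    esac
}

# run_local FILE LOGFILE [NAME=value ...]
run_local() {
    file="$1"
    logfile="$2"
    shift 2
    tool=$(pdi_tool_for "$file") || return 1
    # Rewrite the remaining arguments: NAME=value becomes -param:NAME=value.
    for kv in "$@"; do
        set -- "$@" "-param:$kv"
        shift
    done
    if [ -n "${DRY_RUN:-}" ]; then
        # Print the command instead of running it.
        echo "$PDI_HOME/$tool" -file="$file" -norep -level=Basic -logfile="$logfile" "$@"
        return 0
    fi
    "$PDI_HOME/$tool" -file="$file" -norep -level=Basic -logfile="$logfile" "$@"
}

# Usage:
#   run_local /home/kettle/transition/move.kjb /tmp/move.log FOO=bar
```

Keeping the parameters as separate positional arguments avoids quoting problems when a value contains spaces.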