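Grouping, Summing, and Counting Data with MongoDB Aggregation in PHP 7
阿新 • Published: 2018-12-24
This article shows how to group and sum data with MongoDB's aggregate. It gives the shell command-line form and the PHP 7 form, along with the equivalent MySQL command line for the same data structure.
- Data in the MongoDB collection a_test: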
> db.a_test.find()
{ "_id" : ObjectId("59a2431b57416663f0330a99"), "name" : "jack", "age" : 16, "sex" : "male" }
{ "_id" : ObjectId("59a2432f57416663f0330a9a"), "name" : "lucy", "age" : 16 , "sex" : "female" }
{ "_id" : ObjectId("59a2433c57416663f0330a9b"), "name" : "mike", "age" : 17, "sex" : "male" }
{ "_id" : ObjectId("59a2434857416663f0330a9c"), "name" : "lili", "age" : 18, "sex" : "female" }
{ "_id" : ObjectId("59a2782657416663f0330a9d"), "name" : "jane", "age" : 17, "sex" : "female" }
- Data in the MySQL table a_test:
mysql> select * from a_test;
+----+------+------+--------+
| id | name | age  | sex    |
+----+------+------+--------+
|  3 | jack |   16 | male   |
|  4 | lucy |   16 | female |
|  5 | mike |   17 | male   |
|  6 | lili |   18 | female |
|  7 | jane |   17 | female |
+----+------+------+--------+
I. Simple operations
1. Group by the sex field, then sum the age field
(a) MongoDB:
Shell command line:
> db.a_test.aggregate([{"$group":{"_id":{sex:"$sex"}, 'age_count':{'$sum':'$age'}}}])
{ "_id" : { "sex" : "female" }, "age_count" : 51 }
{ "_id" : { "sex" : "male" }, "age_count" : 33 }
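PHP 7:
// Stage 1 groups by sex and sums age; stage 2 exposes sex as a top-level field.
$pipe = array(
    array(
        '$group' => array(
            '_id' => array('sex' => '$sex'),
            'age_count' => array('$sum' => '$age'),
        ),
    ),
    array(
        '$project' => array(
            'sex' => '$_id.sex',
            'age_count' => '$age_count',
        ),
    ),
);
// LogDB is assumed to be a project-specific wrapper that returns a MongoDB\Collection.
$cursor = LogDB::selectCollection('a_test')->aggregate($pipe);
foreach ($cursor as $value) {
    print_r($value);
}
Printed output: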
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[sex] => female
)
)
[age_count] => 51
[sex] => female
)
)
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[sex] => male
)
)
[age_count] => 33
[sex] => male
)
)
(b) MySQL command line:
mysql> select sex,sum(`age`) as `age_count` from `a_test` group by `sex`;
+--------+-----------+
| sex    | age_count |
+--------+-----------+
| female |        51 |
| male   |        33 |
+--------+-----------+
As shown above, the ages sum to 51 for sex = female and to 33 for sex = male.
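To make the $group / $sum semantics concrete, here is a minimal plain-JavaScript sketch that reproduces the same grouping in memory, without a MongoDB server. The helper name `sumByField` is hypothetical and not part of any MongoDB API; the rows are copied from the a_test data above.

```javascript
// Sample rows copied from the a_test collection shown earlier.
const rows = [
  { name: "jack", age: 16, sex: "male" },
  { name: "lucy", age: 16, sex: "female" },
  { name: "mike", age: 17, sex: "male" },
  { name: "lili", age: 18, sex: "female" },
  { name: "jane", age: 17, sex: "female" },
];

// Hypothetical helper: group docs by `keyField` and sum `valueField`,
// mirroring {$group: {_id: {sex: "$sex"}, age_count: {$sum: "$age"}}}.
function sumByField(docs, keyField, valueField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) || 0) + doc[valueField]);
  }
  // One result object per group, similar in shape to the aggregation output.
  return [...totals].map(([key, total]) => ({ [keyField]: key, age_count: total }));
}

console.log(sumByField(rows, "sex", "age"));
```

Each group key maps to a running total, which is roughly what the $sum accumulator does per `_id` value.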
2. Group by the sex field and count the rows in each group
(a) MongoDB shell:
> db.a_test.aggregate([{"$group":{"_id":{sex:"$sex"}, 'count':{'$sum':1}}}])
{ "_id" : { "sex" : "female" }, "count" : 3 }
{ "_id" : { "sex" : "male" }, "count" : 2 }
PHP 7:
// Stage 1 groups by sex and adds 1 per document; stage 2 exposes sex as a top-level field.
$pipe = array(
    array(
        '$group' => array(
            '_id' => array('sex' => '$sex'),
            'count' => array('$sum' => 1),
        ),
    ),
    array(
        '$project' => array(
            'sex' => '$_id.sex',
            'count' => '$count',
        ),
    ),
);
$cursor = LogDB::selectCollection('a_test')->aggregate($pipe);
foreach ($cursor as $value) {
    print_r($value);
}
Printed output:
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[sex] => female
)
)
[count] => 3
[sex] => female
)
)
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[sex] => male
)
)
[count] => 2
[sex] => male
)
)
(b) MySQL command line:
mysql> select sex,count(*) as `count` from `a_test` group by `sex`;
+--------+-------+
| sex    | count |
+--------+-------+
| female |     3 |
| male   |     2 |
+--------+-------+
As shown above, there are 3 rows with sex = female and 2 rows with sex = male.
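The counting variant differs only in what is accumulated: `{'$sum': 1}` adds 1 for every document in a group. A minimal in-memory JavaScript sketch of that idea follows; `countByField` is a hypothetical helper name, and the rows are the a_test sample data.

```javascript
// Sample rows copied from the a_test collection shown earlier.
const rows = [
  { name: "jack", age: 16, sex: "male" },
  { name: "lucy", age: 16, sex: "female" },
  { name: "mike", age: 17, sex: "male" },
  { name: "lili", age: 18, sex: "female" },
  { name: "jane", age: 17, sex: "female" },
];

// Hypothetical helper: count docs per distinct value of `keyField`,
// mirroring {$group: {_id: {sex: "$sex"}, count: {$sum: 1}}}.
function countByField(docs, keyField) {
  const counts = {};
  for (const doc of docs) {
    const key = doc[keyField];
    counts[key] = (counts[key] || 0) + 1; // the "$sum: 1" step
  }
  return counts;
}

console.log(countByField(rows, "sex"));
```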
II. Next, we use MongoDB for slightly more complex operations, again giving the MySQL command line for the same data structure.
- Data in the MongoDB collection b_test:
> db.b_test.find()
{ "_id" : ObjectId("59a288eacc90a3fcee840637"), "user" : "jack", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a288f2cc90a3fcee840638"), "user" : "jack", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a2890ecc90a3fcee840639"), "user" : "jack", "game" : "game-2", "date" : 20170826, "game_type" : "alone" }
{ "_id" : ObjectId("59a28927cc90a3fcee84063a"), "user" : "mike", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a28938cc90a3fcee84063b"), "user" : "lili", "game" : "game-3", "date" : 20170820, "game_type" : "online" }
- Data in the MySQL table b_test:
mysql> select * from `b_test`;
+----+------+--------+----------+-----------+
| id | user | game   | date     | game_type |
+----+------+--------+----------+-----------+
|  1 | jack | game-1 | 20170826 | online    |
|  2 | jack | game-1 | 20170826 | online    |
|  3 | jack | game-2 | 20170826 | alone     |
|  4 | mike | game-1 | 20170826 | online    |
|  5 | lili | game-3 | 20170820 | online    |
+----+------+--------+----------+-----------+
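- Assume b_test is a log of user game logins. In both MySQL and MongoDB the date field is an int here; in practice one would normally store a timestamp.
1. Count the distinct users who logged in to each game after 20170820 (repeated logins by the same account count only once)
(a) MongoDB shell:
First select the logins after 20170820 and deduplicate them by user and game:
> db.b_test.aggregate([{'$match':{'date':{'$gt':20170820}}}, {'$group':{'_id':{'user':'$user','game':'$game'}}}]);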
{ "_id" : { "user" : "mike", "game" : "game-1" } }
{ "_id" : { "user" : "jack", "game" : "game-2" } }
{ "_id" : { "user" : "jack", "game" : "game-1" } }
Then count per game:
> db.b_test.aggregate([{'$match':{'date':{'$gt':20170820}}}, {'$group':{'_id':{'user':'$user','game':'$game'}}},{'$group':{'_id':{'game':'$_id.game'}, count:{'$sum':1}}}]);
{ "_id" : { "game" : "game-2" }, "count" : 1 }
{ "_id" : { "game" : "game-1" }, "count" : 2 }
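The three stages above ($match, a deduplicating $group, then a counting $group) can be sketched in plain JavaScript to show what each one contributes. This is an in-memory illustration only; `distinctUsersPerGame` is a hypothetical helper name, and the rows are the b_test sample data.

```javascript
// Sample rows copied from the b_test collection shown earlier.
const logs = [
  { user: "jack", game: "game-1", date: 20170826, game_type: "online" },
  { user: "jack", game: "game-1", date: 20170826, game_type: "online" },
  { user: "jack", game: "game-2", date: 20170826, game_type: "alone" },
  { user: "mike", game: "game-1", date: 20170826, game_type: "online" },
  { user: "lili", game: "game-3", date: 20170820, game_type: "online" },
];

// Hypothetical helper mirroring the three pipeline stages.
function distinctUsersPerGame(docs, afterDate) {
  // Stage 1 ($match): keep logins strictly after `afterDate`.
  const matched = docs.filter((doc) => doc.date > afterDate);
  // Stage 2 (first $group): one entry per distinct (user, game) pair.
  const pairs = new Set(matched.map((doc) => JSON.stringify([doc.user, doc.game])));
  // Stage 3 (second $group): count the distinct pairs per game.
  const counts = {};
  for (const pair of pairs) {
    const [, game] = JSON.parse(pair);
    counts[game] = (counts[game] || 0) + 1;
  }
  return counts;
}

console.log(distinctUsersPerGame(logs, 20170820));
```

Because the comparison is strict, the game-3 login dated exactly 20170820 is filtered out, which is why game-3 does not appear in the results.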
PHP 7:
$pipe = array(
    // Keep only logins after 20170820.
    array(
        '$match' => array(
            'date' => array('$gt' => 20170820),
        ),
    ),
    // Deduplicate: one document per (user, game) pair.
    array(
        '$group' => array(
            '_id' => array('user' => '$user', 'game' => '$game'),
        ),
    ),
    // Count the distinct users per game.
    array(
        '$group' => array(
            '_id' => array('game' => '$_id.game'),
            'count' => array('$sum' => 1),
        ),
    ),
    // Expose game as a top-level field.
    array(
        '$project' => array(
            'game' => '$_id.game',
            'count' => '$count',
        ),
    ),
);
$cursor = LogDB::selectCollection('b_test')->aggregate($pipe);
foreach ($cursor as $value) {
    print_r($value);
}
Printed output:
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[game] => game-2
)
)
[count] => 1
[game] => game-2
)
)
MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[_id] => MongoDB\Model\BSONDocument Object
(
[storage:ArrayObject:private] => Array
(
[game] => game-1
)
)
[count] => 2
[game] => game-1
)
)
(b) MySQL command line:
First deduplicate:
mysql> select * from b_test where `date` > 20170820 group by `user`,`game`;
+----+------+--------+----------+-----------+
| id | user | game   | date     | game_type |
+----+------+--------+----------+-----------+
|  1 | jack | game-1 | 20170826 | online    |
|  3 | jack | game-2 | 20170826 | alone     |
|  4 | mike | game-1 | 20170826 | online    |
+----+------+--------+----------+-----------+
Then count the total number of users for each game:
mysql> select `game`,count(*) as `count`,`game_type` from (select * from b_test where `date` > 20170820 group by `user`,`game`) as dictint group by `game`;
+--------+-------+-----------+
| game   | count | game_type |
+--------+-------+-----------+
| game-1 |     2 | online    |
| game-2 |     1 | alone     |
+--------+-------+-----------+
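Open question: in MySQL, select chooses which fields to keep, and in MongoDB $project can keep fields produced by the previous stage. How can the game_type field be retained in II (1)? I gave it a try; if you know how, please leave a comment. Thanks!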
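One possible approach to the question above, offered as a sketch rather than something verified against the author's database: carry game_type through each $group stage with the `$first` accumulator. This assumes every login for a given game has the same game_type, as in the sample data; without a preceding $sort, which document counts as "first" is not guaranteed. The `pipeline` constant is written for the mongo shell, and `countWithGameType` is a hypothetical in-memory helper that only illustrates the intended result.

```javascript
// Candidate pipeline for the mongo shell: db.b_test.aggregate(pipeline).
// $first keeps the game_type of the first document seen in each group.
const pipeline = [
  { $match: { date: { $gt: 20170820 } } },
  { $group: { _id: { user: "$user", game: "$game" }, game_type: { $first: "$game_type" } } },
  { $group: { _id: { game: "$_id.game" }, count: { $sum: 1 }, game_type: { $first: "$game_type" } } },
];

// Sample rows copied from the b_test collection shown earlier.
const logs = [
  { user: "jack", game: "game-1", date: 20170826, game_type: "online" },
  { user: "jack", game: "game-1", date: 20170826, game_type: "online" },
  { user: "jack", game: "game-2", date: 20170826, game_type: "alone" },
  { user: "mike", game: "game-1", date: 20170826, game_type: "online" },
  { user: "lili", game: "game-3", date: 20170820, game_type: "online" },
];

// Hypothetical in-memory equivalent of the pipeline above.
function countWithGameType(docs, afterDate) {
  const seenPairs = new Set();
  const perGame = new Map();
  for (const doc of docs) {
    if (!(doc.date > afterDate)) continue; // $match
    const pairKey = JSON.stringify([doc.user, doc.game]);
    if (seenPairs.has(pairKey)) continue; // first $group: skip repeated (user, game)
    seenPairs.add(pairKey);
    // Second $group: the first document seen for a game fixes its game_type.
    const entry = perGame.get(doc.game) || { game: doc.game, count: 0, game_type: doc.game_type };
    entry.count += 1;
    perGame.set(doc.game, entry);
  }
  return [...perGame.values()];
}

console.log(countWithGameType(logs, 20170820));
```

In PHP the same stages would be expressed as nested arrays in `$pipe`, with `'game_type' => array('$first' => '$game_type')` added to both `$group` stages.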