Grouping, summing, and counting data with MongoDB aggregation in PHP 7

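  • This article shows how to group and sum data with MongoDB's aggregate. It gives the shell command-line form and the PHP 7 form, along with the equivalent MySQL command-line query for the same data structure.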

  • Data in the MongoDB collection a_test:

> db.a_test.find()
{ "_id" : ObjectId("59a2431b57416663f0330a99"), "name" : "jack", "age" : 16, "sex" : "male" }
{ "_id" : ObjectId("59a2432f57416663f0330a9a"), "name" : "lucy", "age" : 16, "sex" : "female" }
{ "_id" : ObjectId("59a2433c57416663f0330a9b"), "name" : "mike", "age" : 17, "sex" : "male" }
{ "_id" : ObjectId("59a2434857416663f0330a9c"), "name" : "lili", "age" : 18, "sex" : "female" }
{ "_id" : ObjectId("59a2782657416663f0330a9d"), "name" : "jane", "age" : 17, "sex" : "female" }
  • Data in the MySQL table a_test:
mysql> select * from a_test;
+----+------+------+--------+
| id | name | age  | sex    |
+----+------+------+--------+
|  3 | jack |   16 | male   |
|  4 | lucy |   16 | female |
|  5 | mike |   17 | male   |
|  6 | lili |   18 | female |
|  7 | jane |   17 | female |
+----+------+------+--------+

I. Simple operations

1. Group by the sex field, then sum the age field
(a) MongoDB:
Shell command line:

> db.a_test.aggregate([{"$group":{"_id":{sex:"$sex"}, 'age_count':{'$sum':'$age'}}}])
{ "_id" : { "sex" : "female" }, "age_count" : 51 }
{ "_id" : { "sex" : "male" }, "age_count" : 33 }

PHP 7:

        $pipe = array(
            array(
                '$group' => array(
                    '_id' => array('sex' => '$sex'),
                    'age_count' => array('$sum' => '$age'),
                ),
            ),
            array(
                '$project' => array(
                    'sex' => '$_id.sex',
                    'age_count' => '$age_count',
                ),
            ),
        );
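        // LogDB is the author's own wrapper class (not shown); selectCollection() is assumed
        // to return a MongoDB\Collection from the mongodb/mongodb library.
        $cursor = LogDB::selectCollection('a_test')->aggregate($pipe);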
        foreach ($cursor as $value) {
            print_r($value);
        }

Printed output:

MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [sex] => female
                        )

                )

            [age_count] => 51
            [sex] => female
        )

)
MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [sex] => male
                        )

                )

            [age_count] => 33
            [sex] => male
        )

)

(b) MySQL command line:

mysql> select sex,sum(`age`) as `age_count` from `a_test` group by `sex`;
+--------+-----------+
| sex    | age_count |
+--------+-----------+
| female |        51 |
| male   |        33 |
+--------+-----------+

As shown above, the ages sum to 51 for sex = female and to 33 for sex = male.
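
To make the semantics concrete, the same grouping can be mimicked in plain JavaScript. This is only an illustration of what $group with $sum computes on the sample data, not how MongoDB implements it; groupSum is a hypothetical helper introduced for this sketch.

```javascript
// Sample rows copied from the a_test collection above (without _id).
var rows = [
  { name: 'jack', age: 16, sex: 'male' },
  { name: 'lucy', age: 16, sex: 'female' },
  { name: 'mike', age: 17, sex: 'male' },
  { name: 'lili', age: 18, sex: 'female' },
  { name: 'jane', age: 17, sex: 'female' },
];

// Hypothetical helper: group docs by keyField and sum valueField per group,
// which is what {$group: {_id: ..., total: {$sum: ...}}} does conceptually.
function groupSum(docs, keyField, valueField) {
  var totals = {};
  docs.forEach(function (doc) {
    var key = doc[keyField];
    totals[key] = (totals[key] || 0) + doc[valueField];
  });
  return totals;
}

var ageBySex = groupSum(rows, 'sex', 'age');
console.log(ageBySex); // female: 51, male: 33
```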

2. Group by the sex field and count the records in each group
(a) MongoDB shell:

> db.a_test.aggregate([{"$group":{"_id":{sex:"$sex"}, 'count':{'$sum':1}}}])
{ "_id" : { "sex" : "female" }, "count" : 3 }
{ "_id" : { "sex" : "male" }, "count" : 2 }

PHP 7:

        $pipe = array(
            array(
                '$group' => array(
                    '_id' => array('sex' => '$sex'),
                    'count' => array('$sum' => 1),
                ),
            ),
            array(
                '$project' => array(
                    'sex' => '$_id.sex',
                    'count' => '$count',
                ),
            ),
        );
        $cursor = LogDB::selectCollection('a_test')->aggregate($pipe);
        foreach ($cursor as $value) {
            print_r($value);
        }

Printed output:

MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [sex] => female
                        )

                )

            [count] => 3
            [sex] => female
        )

)
MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [sex] => male
                        )

                )

            [count] => 2
            [sex] => male
        )

)

(b) MySQL command line:

mysql> select sex,count(*) as `count` from `a_test` group by `sex`;
+--------+-------+
| sex    | count |
+--------+-------+
| female |     3 |
| male   |     2 |
+--------+-------+

As shown above, there are 3 records with sex = female and 2 with sex = male.
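
The printed results above still carry the nested _id document, because $project keeps _id unless it is excluded explicitly. A possible variant, sketched here in shell syntax, sets _id to 0 so that each result holds only sex and count. The variable name pipeline is an assumption of this sketch.

```javascript
// Variant of the counting pipeline: exclude _id in the $project stage.
var pipeline = [
  // Count the documents in each sex group.
  { $group: { _id: { sex: '$sex' }, count: { $sum: 1 } } },
  // Lift the group key to a top-level field and drop the nested _id.
  { $project: { _id: 0, sex: '$_id.sex', count: 1 } },
];
// In the mongo shell this would be run as: db.a_test.aggregate(pipeline)
```

In PHP the corresponding stage would be '$project' => array('_id' => 0, 'sex' => '$_id.sex', 'count' => 1).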

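II. Next, a slightly more complex operation with MongoDB. The equivalent MySQL command line for the same data structure is given as well.

  • Data in the MongoDB collection b_test: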
> db.b_test.find()
{ "_id" : ObjectId("59a288eacc90a3fcee840637"), "user" : "jack", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a288f2cc90a3fcee840638"), "user" : "jack", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a2890ecc90a3fcee840639"), "user" : "jack", "game" : "game-2", "date" : 20170826, "game_type" : "alone" }
{ "_id" : ObjectId("59a28927cc90a3fcee84063a"), "user" : "mike", "game" : "game-1", "date" : 20170826, "game_type" : "online" }
{ "_id" : ObjectId("59a28938cc90a3fcee84063b"), "user" : "lili", "game" : "game-3", "date" : 20170820, "game_type" : "online" }
  • Data in the MySQL table b_test:
mysql> select * from `b_test`;
+----+------+--------+----------+-----------+
| id | user | game   | date     | game_type |
+----+------+--------+----------+-----------+
|  1 | jack | game-1 | 20170826 | online    |
|  2 | jack | game-1 | 20170826 | online    |
|  3 | jack | game-2 | 20170826 | alone     |
|  4 | mike | game-1 | 20170826 | online    |
|  5 | lili | game-3 | 20170820 | online    |
+----+------+--------+----------+-----------+
  • Assume b_test is a log of users' game logins. The date field is an int in both MySQL and MongoDB, although in practice a timestamp would normally be stored.

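1. Count the users who logged in to each game after 20170820 (repeated logins by the same account count only once)
(a) MongoDB shell:

First select the logins after 20170820 and deduplicate them by user and game: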

> db.b_test.aggregate([{'$match':{'date':{'$gt':20170820}}}, {'$group':{'_id':{'user':'$user','game':'$game'}}}]);
{ "_id" : { "user" : "mike", "game" : "game-1" } }
{ "_id" : { "user" : "jack", "game" : "game-2" } }
{ "_id" : { "user" : "jack", "game" : "game-1" } }
Then count per game:
> db.b_test.aggregate([{'$match':{'date':{'$gt':20170820}}}, {'$group':{'_id':{'user':'$user','game':'$game'}}},{'$group':{'_id':{'game':'$_id.game'}, count:{'$sum':1}}}]);
{ "_id" : { "game" : "game-2" }, "count" : 1 }
{ "_id" : { "game" : "game-1" }, "count" : 2 }
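
The two-step logic above (filter by date, deduplicate user/game pairs, then count per game) can be mirrored in plain JavaScript to check the numbers. This is an illustrative sketch over the sample data rather than MongoDB's implementation; countDistinctUsers is a hypothetical helper.

```javascript
// Sample rows copied from the b_test collection above (without _id).
var logs = [
  { user: 'jack', game: 'game-1', date: 20170826, game_type: 'online' },
  { user: 'jack', game: 'game-1', date: 20170826, game_type: 'online' },
  { user: 'jack', game: 'game-2', date: 20170826, game_type: 'alone' },
  { user: 'mike', game: 'game-1', date: 20170826, game_type: 'online' },
  { user: 'lili', game: 'game-3', date: 20170820, game_type: 'online' },
];

// Hypothetical helper mirroring $match -> $group(user, game) -> $group(game).
function countDistinctUsers(docs, afterDate) {
  var usersByGame = {};
  docs.forEach(function (doc) {
    if (doc.date <= afterDate) return;      // $match: keep date > afterDate
    if (!usersByGame[doc.game]) usersByGame[doc.game] = new Set();
    usersByGame[doc.game].add(doc.user);    // a Set ignores repeated users
  });
  var counts = {};
  Object.keys(usersByGame).forEach(function (game) {
    counts[game] = usersByGame[game].size;  // distinct users per game
  });
  return counts;
}

var userCounts = countDistinctUsers(logs, 20170820);
console.log(userCounts); // game-1: 2, game-2: 1
```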

PHP 7:

        $pipe = array(
            array(
                '$match' => array(
                    'date' => array('$gt' => 20170820),
                ),
            ),
            array(
                '$group' => array(
                    '_id' => array('user' => '$user', 'game' => '$game'),
                ),
            ),
            array(
                '$group' => array(
                    '_id' => array('game' => '$_id.game'),
                    'count' => array('$sum' => 1),
                ),
            ),
            array(
                '$project' => array(
                    'game' => '$_id.game',
                    'count' => '$count',
                ),
            ),
        );
        $cursor = LogDB::selectCollection('b_test')->aggregate($pipe);
        foreach ($cursor as $value) {
            print_r($value);
        }

Printed output:

MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [game] => game-2
                        )

                )

            [count] => 1
            [game] => game-2
        )

)
MongoDB\Model\BSONDocument Object
(
    [storage:ArrayObject:private] => Array
        (
            [_id] => MongoDB\Model\BSONDocument Object
                (
                    [storage:ArrayObject:private] => Array
                        (
                            [game] => game-1
                        )

                )

            [count] => 2
            [game] => game-1
        )

)

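(b) MySQL command line
First deduplicate (selecting non-grouped columns like this only runs when the ONLY_FULL_GROUP_BY SQL mode is disabled):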

mysql> select * from b_test  where `date` > 20170820 group by `user`,`game`;
+----+------+--------+----------+-----------+
| id | user | game   | date     | game_type |
+----+------+--------+----------+-----------+
|  1 | jack | game-1 | 20170826 | online    |
|  3 | jack | game-2 | 20170826 | alone     |
|  4 | mike | game-1 | 20170826 | online    |
+----+------+--------+----------+-----------+

Then count the total number of users for each game:

mysql> select `game`,count(*) as `count`,`game_type` from (select * from b_test  where `date` > 20170820 group by `user`,`game`) as dictint group by `game`;
+--------+-------+-----------+
| game   | count | game_type |
+--------+-------+-----------+
| game-1 |     2 | online    |
| game-2 |     1 | alone     |
+--------+-------+-----------+

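Open question: in MySQL, select chooses which fields to keep, and in MongoDB $project can keep fields produced by the previous stage. How can the game_type field be kept in II(1)? I gave it a try; if you know how, please leave a comment. Thanks!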
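
One approach that may answer the question is the $first accumulator: adding game_type: {$first: '$game_type'} to each $group stage carries one value of the field through to the final result. The pipeline below is a sketch in shell syntax under that assumption, not something taken from the original article. When a game has rows with different game_type values, the value $first returns depends on document order unless a $sort stage comes before the group.

```javascript
// Sketch: keep game_type through both $group stages with $first.
var pipeline = [
  // Logins after 20170820 only.
  { $match: { date: { $gt: 20170820 } } },
  // Deduplicate (user, game) pairs and remember one game_type per pair.
  { $group: {
      _id: { user: '$user', game: '$game' },
      game_type: { $first: '$game_type' },
  } },
  // Count distinct users per game, again carrying game_type along.
  { $group: {
      _id: { game: '$_id.game' },
      count: { $sum: 1 },
      game_type: { $first: '$game_type' },
  } },
  // Flatten the result to game, count, and game_type.
  { $project: { _id: 0, game: '$_id.game', count: 1, game_type: 1 } },
];
// In the mongo shell this would be run as: db.b_test.aggregate(pipeline)
```

In PHP the same idea amounts to adding 'game_type' => array('$first' => '$game_type') to both '$group' arrays and 'game_type' => 1 to the '$project' array.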