Basic Usage of MongoDB and PyMongo

This article covers MongoDB's CRUD operations and how to perform the same operations with PyMongo. It is a beginner's primer.

1. Installation and Running

Mac
# Install
brew tap mongodb/brew
brew install mongodb-community@4.0

# Start
brew services start mongodb-community@4.0

# Log file
/usr/local/var/log/mongodb/mongo.log

# Config file
/usr/local/etc/mongod.conf
Linux (Ubuntu)
# Install
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 9DA31620334BD75D9DCB49F368818C72E52529D4
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/4.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-4.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

# Start / stop / restart
sudo service mongod start
sudo service mongod stop
sudo service mongod restart

# Log file
/var/log/mongodb/mongod.log

# Config file
/etc/mongod.conf
Windows
For learning purposes, Linux is the better choice: install Ubuntu in a virtual machine.
MongoDB does not support WSL.

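2. Concepts

Before using MongoDB, let's get the concepts straight. Unlike a relational database, MongoDB is a non-relational document database. It has no notion of foreign keys, and it stores data as BSON (a JSON-like data structure).

A diagram taken from the PyMongo documentation illustrates this data structure well.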

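Comparison with relational database concepts

Relational concept    MongoDB equivalent
Database              Database
Tables                Collections
Rows                  Documents
Index                 Index

The biggest difference is that we are no longer constrained by a table schema and can write JSON documents of any shape. Either way we are operating on one record at a time, whether it is a row or a JSON document (this flexibility is a double-edged sword).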

3. Mongo Shell

We can operate MongoDB from the mongo shell. A separate GUI client can also be downloaded, but the bundled tool is sufficient.

Enter the mongo shell and list the databases and collections.

$ mongo
> show databases
admin   0.000GB
config  0.000GB
local   0.000GB
test    0.000GB

// switch to the given db
> use test
switched to db test

// show the current db
> db
test

// list the collections in the current db
> show collections

In MongoDB, databases and collections do not need to be created explicitly; they are created automatically on the first insert.

// switch db; the db itself is only created once data is inserted
> use demo
> db.inventory.insert({first_doc: "This is my first doc."})
WriteResult({ "nInserted" : 1 })

// the demo db was created automatically by the insert
> show dbs
...
demo
...

// the collection was also created automatically by the insert
> show collections
inventory

> db.inventory.find({})
{ "_id" : ObjectId("5c9ad59a52fc8581a5707ce9"), "first_doc" : "This is my first doc." }

> db.inventory.drop() // drop the collection
> db.dropDatabase() // drop the db
insert

Commands:

db.collection.insertOne() // insert one document
db.collection.insertMany() // insert multiple documents
db.collection.insert() // insert one or more documents (legacy; deprecated in mongosh)

Examples:

db.inventory.insertOne(
   { item: "canvas", qty: 100, tags: ["cotton"], size: { h: 28, w: 35.5, uom: "cm" } }
)

db.inventory.insertMany([
   { item: "journal", qty: 25, tags: ["blank", "red"], size: { h: 14, w: 21, uom: "cm" } },
   { item: "mat", qty: 85, tags: ["gray"], size: { h: 27.9, w: 35.5, uom: "cm" } },
   { item: "mousepad", qty: 25, tags: ["gel", "blue"], size: { h: 19, w: 22.85, uom: "cm" } }
])

db.inventory.insert([
   { item: "journal", qty: 25, tags: ["blank", "red"], size: { h: 14, w: 21, uom: "cm" } },
   { item: "mat", qty: 85, tags: ["gray"], size: { h: 27.9, w: 35.5, uom: "cm" } },
   { item: "mousepad", qty: 25, tags: ["gel", "blue"], size: { h: 19, w: 22.85, uom: "cm" } }
])
query

Commands:

db.collection.findOne() // return the first matching document
db.collection.find() // return a cursor over all matching documents

Examples:

db.inventory.find({}) // equivalent to: select * from inventory;

db.inventory.find({status: "D"})
// select * from inventory where status = 'D';

db.inventory.find({status: {$in: ["A", "D"]}})
// select * from inventory where status in ('A', 'D');

db.inventory.find({status: "A", qty: {$lt: 30}})
// select * from inventory where status = 'A' and qty < 30;

db.inventory.find({$or: [{status: "A"}, {qty: {$lt: 30}}]})
// select * from inventory where status = 'A' or qty < 30;

db.inventory.find({status: "A", $or: [{qty: {$lt: 30}}, {item: /^p/}]})
// select * from inventory where status = 'A' and (qty < 30 or item like 'p%');

Operators

field: value # =
$lt # <
$lte # <=
$gt # >
$gte # >=
$in # in
$or # or
{a: 1, b: 2} # a = 1 and b = 2 (comma-separated conditions in one filter are ANDed)
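
To make the operator semantics above concrete, here is a minimal pure-Python sketch of how a filter document could be evaluated against a single document. The `matches` helper and the sample documents are hypothetical illustrations, not MongoDB's implementation. It covers only equality, `$lt`/`$lte`/`$gt`/`$gte`, `$in`, and `$or`, and ignores regexes, arrays, and dotted paths.

```python
import operator

# Comparison operators covered by this sketch (assumption: scalar values only).
COMPARATORS = {
    "$lt": operator.lt,
    "$lte": operator.le,
    "$gt": operator.gt,
    "$gte": operator.ge,
}


def matches(doc, query):
    """Return True if `doc` satisfies `query` (hypothetical helper)."""
    for key, cond in query.items():
        if key == "$or":
            # $or: at least one sub-filter must match.
            if not any(matches(doc, sub) for sub in cond):
                return False
        elif isinstance(cond, dict):
            # Operator form, e.g. {"qty": {"$lt": 30}}.
            if key not in doc:
                return False
            value = doc[key]
            for op, operand in cond.items():
                if op == "$in":
                    ok = value in operand
                else:
                    ok = COMPARATORS[op](value, operand)
                if not ok:
                    return False
        elif doc.get(key) != cond:
            # Plain form, e.g. {"status": "A"}, means equality.
            return False
    # Every top-level condition passed: conditions are ANDed.
    return True


# Made-up sample documents for illustration.
docs = [
    {"item": "journal", "status": "A", "qty": 25},
    {"item": "notebook", "status": "A", "qty": 50},
    {"item": "paper", "status": "D", "qty": 100},
]

and_hits = [d["item"] for d in docs if matches(d, {"status": "A", "qty": {"$lt": 30}})]
or_hits = [d["item"] for d in docs if matches(d, {"$or": [{"status": "D"}, {"qty": {"$lt": 30}}]})]
in_hits = [d["item"] for d in docs if matches(d, {"status": {"$in": ["A", "D"]}})]
print(and_hits, or_hits, in_hits)
```

With this sample data, the AND filter selects only the journal, the `$or` filter selects the journal and the paper, and the `$in` filter selects all three.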
update

Commands:

db.collection.updateOne() // update one document
db.collection.updateMany() // update multiple documents
db.collection.replaceOne() // replace one document

Examples:

db.inventory.updateOne(
   { item: "journal" },
   {
     $set: { "size.uom": "in", status: "P" },
     $currentDate: { lastModified: true }
   }
)
// update inventory set size.uom='in', status='P',
// lastModified=NOW() where id in (
//    select id from (
//        select id from inventory where item ='journal' limit 1
//    ) tmp
// );
// The SQL has to be this long to be a faithful equivalent.
// In practice it simply means: modify only one matching record.

db.inventory.updateMany(
   { "qty": {$lt: 30}},
   {
     $set: { "size.uom": "cm", status: "A" },
     $currentDate: { lastModified: true }
   }
)
// Modifies every record with qty < 30.

db.inventory.replaceOne(
   { item: "mat" },
   { item: "paper", instock: [ { warehouse: "A", qty: 60 }, { warehouse: "B", qty: 40 } ] }
)
// Replaces one record matching item = mat with the {item: ...} document that follows.

Operators

$set # update set x = n
$inc # update set x = x + n
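
As a rough illustration of what `$set` and `$inc` do to a document, the sketch below applies them to a plain Python dict, including the dotted-path form (`"size.uom"`) used in the examples above. `apply_update` is a hypothetical helper written for this article; the server's real update logic handles many more operators and edge cases.

```python
import copy


def apply_update(doc, update):
    """Return a copy of `doc` with $set / $inc applied (hypothetical helper)."""
    result = copy.deepcopy(doc)
    for op, fields in update.items():
        if op not in ("$set", "$inc"):
            raise ValueError(f"operator not covered by this sketch: {op}")
        for path, value in fields.items():
            *parents, leaf = path.split(".")
            target = result
            for part in parents:
                # Walk (or create) nested sub-documents for dotted paths.
                target = target.setdefault(part, {})
            if op == "$set":
                target[leaf] = value  # x = n
            else:
                target[leaf] = target.get(leaf, 0) + value  # x = x + n
    return result


before = {"item": "journal", "qty": 25, "size": {"h": 14, "w": 21, "uom": "cm"}}
after = apply_update(
    before,
    {"$set": {"size.uom": "in", "status": "P"}, "$inc": {"qty": 5}},
)
print(after)
```

The helper returns a modified copy and leaves `before` untouched, which makes the effect of each operator easy to compare side by side.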
delete

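Commands:

db.collection.deleteOne() // delete one document
db.collection.deleteMany() // delete multiple documents

Examples:

db.inventory.deleteOne({status:"A"}) // delete the first document matching the filter
db.inventory.deleteMany({status:"A"}) // delete all documents matching the filter
db.inventory.deleteMany({}) // delete every document in inventory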
aggregate pipeline

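An aggregation pipeline runs several aggregation stages one after another, with each stage consuming the output of the previous one.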

db.inventory.aggregate([
    {$match: {status: "A"}},
    {$group: {_id: "$item", total: {$sum: "$qty"}}}
])

// select sum(qty) as total, item as _id from inventory
// where status = 'A' group by item;

{ "_id" : "notebook", "total" : 50 }
{ "_id" : "postcard", "total" : 45 }
{ "_id" : "journal", "total" : 25 }
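
To see what the two stages do, here is a pure-Python sketch that runs the same `$match` then `$group`/`$sum` logic over an in-memory list. The sample documents are made up for illustration (the article does not show the collection's contents) and were chosen so that the totals agree with the output above. The real pipeline runs inside the server.

```python
from collections import defaultdict

# Hypothetical sample data; not the article's actual collection.
inventory = [
    {"item": "notebook", "status": "A", "qty": 20},
    {"item": "notebook", "status": "A", "qty": 30},
    {"item": "postcard", "status": "A", "qty": 45},
    {"item": "journal", "status": "A", "qty": 25},
    {"item": "paper", "status": "D", "qty": 100},
]

# Stage 1, $match: keep only documents with status "A".
matched = [doc for doc in inventory if doc["status"] == "A"]

# Stage 2, $group: group by item and sum qty within each group.
totals = defaultdict(int)
for doc in matched:
    totals[doc["item"]] += doc["qty"]

result = [{"_id": item, "total": total} for item, total in totals.items()]
for row in result:
    print(row)
```

Filtering first with `$match` keeps the grouping stage small, which is also why `$match` is usually placed as early as possible in a pipeline.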

4. PyMongo

PyMongo is the official Python driver. Once you are familiar with the native mongo commands, using PyMongo is mostly a matter of analogy.

install
pip install pymongo
client
from pymongo import MongoClient

client = MongoClient()
db = client.test # use the test db again

# quick check: print every document previously inserted into inventory
result = db.inventory.find({})
for d in result:
    print(d)

insert
from random import choice, randint

# assumes `db` is the database handle created in the client section
names = ['JD','Ali','Ten', 'Bd', 'YZ']
company_area = ['Shenzhen', 'BeiJing', 'Shanghai', 'Hangzhou']
company_department = ['BD','Dev','Opt', 'UI', 'PM', 'QA']
company_worktime = ['996', '9116', '885', '997', '975']
docs = []
for x in range(1, 101):  # build 100 random review documents
    doc = {
        'name': choice(names),
        'area': choice(company_area),
        'department': choice(company_department),
        'total': randint(1, 10),
        'worktime': choice(company_worktime)
    }
    docs.append(doc)
db.reviews.insert_many(docs)

query
# aggregate, then sort by company name and department

ret = db.reviews.aggregate([
    {'$group': {'_id': {'name':'$name', 'department': '$department'}, 'count': {'$sum': '$total'}}}, 
    {'$sort': {'_id.name': -1, '_id.department': -1}}
])

for r in ret:
    print(r)

update
db.reviews.update_many(
    {'name': 'Ali', 'area': 'Hangzhou', 'department': 'QA'},
    {'$inc': {'total': 1}}
)

result = db.reviews.find({'name': 'Ali', 'area':'Hangzhou', 'department':'QA'})
for r in result:
    print(r)
delete
db.reviews.delete_many({'name': 'Ali', 'area':'Hangzhou', 'department':'QA'})

result = db.reviews.find({'name': 'Ali', 'area':'Hangzhou', 'department':'QA'})
for r in result:
    print(r)

PyMongo usage is almost identical to the mongo shell; shell commands work after a small amount of conversion.
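
As one example of that conversion, the shell query `db.inventory.find({status: "A", $or: [{qty: {$lt: 30}}, {item: /^p/}]})` from section 3 could be written in PyMongo roughly as below. Keys and operators become quoted strings, and the regex literal becomes a `$regex` operator (a compiled `re` pattern should also work). The `build_filter` and `find_active` functions and the `FakeCollection` stand-in are hypothetical, there only so the sketch runs without a server; with a real database you would pass `db.inventory` instead.

```python
def build_filter(max_qty=30, prefix="p"):
    """Build the PyMongo filter dict mirroring the shell query (hypothetical helper)."""
    return {
        "status": "A",
        "$or": [
            {"qty": {"$lt": max_qty}},
            {"item": {"$regex": "^" + prefix}},
        ],
    }


def find_active(collection):
    """Run the query against anything exposing a PyMongo-style find()."""
    return list(collection.find(build_filter()))


class FakeCollection:
    """Stand-in that only records the filter it receives; not part of PyMongo."""

    def __init__(self):
        self.last_filter = None

    def find(self, query):
        self.last_filter = query
        return iter([])  # a real collection returns a cursor of documents


fake = FakeCollection()
rows = find_active(fake)
print(fake.last_filter)
```

Keeping the filter in a small function makes it easy to inspect or unit-test the query without touching a database.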