Chapter 1: MongoDB in big data practice - how is data written?


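In this chapter we start with a simple example that shows how MongoDB stores data.

Environment setup

MongoDB server download:
www.mongodb.com/download-ce…

Robomongo client download:
robomongo.org/download

Connection configuration

The detailed configuration is as follows: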

Writing a single record

Run the following MongoDB command:
db.test.save({
    "city_id" : "298",
    "city_name" : "阿坝",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
})
or
db.test.insert({
    "city_id" : "298",
    "city_name" : "阿坝",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
})

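When the document specifies an _id primary key, insert throws an exception if that key already exists:
E11000 duplicate key error collection: logs.test index: _id_ dup key: { : "298" }
save, by contrast, overwrites the stored document that has that _id.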
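
The difference between the two commands can be reproduced from Python. The helper below is a minimal sketch, not part of the original article: `save_like` is a hypothetical name, and it assumes `collection` is a PyMongo `Collection` (or any object exposing `replace_one` and `insert_one`). PyMongo removed its own `Collection.save` in version 4.0, so the save behaviour is expressed here as a replace with `upsert=True`.

```python
def save_like(collection, doc):
    """Mimic the shell's save(): replace by _id if present, otherwise insert."""
    if "_id" in doc:
        # replace_one with upsert=True overwrites the document that has this
        # _id, or inserts doc when no such document exists yet.
        return collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
    # No _id supplied: a plain insert, and an ObjectId is generated for it.
    return collection.insert_one(doc)
```

Calling `collection.insert_one(doc)` directly with an `_id` that already exists raises `pymongo.errors.DuplicateKeyError`, which corresponds to the E11000 error above.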

The stored document looks like this:

{
    "_id" : ObjectId("5e1d8e64876551cd96cb576d"),
    "city_id" : "298",
    "city_name" : "阿坝",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
}

When no _id is specified, MongoDB automatically generates a primary key of type ObjectId.
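
A generated ObjectId is not opaque: its first 4 bytes are a big-endian Unix timestamp of when it was created. The snippet below is a small illustration using only the standard library; `objectid_created_at` is a hypothetical helper name, and it reads the timestamp from the ObjectId shown in the result above.

```python
from datetime import datetime, timezone

def objectid_created_at(hex_id):
    """Return the creation time embedded in a 24-character ObjectId hex string."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(hex_id[:8], 16)  # first 4 bytes: seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

created = objectid_created_at("5e1d8e64876551cd96cb576d")
print(created.isoformat())  # 2020-01-14T09:48:20+00:00
```

With PyMongo installed, `bson.ObjectId(...).generation_time` returns the same value.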

Writing multiple records

Run the following MongoDB command:
db.test.insertMany([{
    "city_id" : "299",
    "city_name" : "阿坝1",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
},{
    "city_id" : "300",
    "city_name" : "阿坝2",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
}])
Result:
{
    "acknowledged" : true,
    "insertedIds" : [ 
        ObjectId("5e1d9397876551cd96cb576f"), 
        ObjectId("5e1d9397876551cd96cb5770")
    ]
}
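If the records in the batch carry explicit _id values and one of them duplicates an existing key, insertMany (which is ordered by default) stops at the duplicate, so the records after it are not written.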
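
To make that failure mode concrete, here is a toy in-memory model of the duplicate handling. It is only an illustration and does not talk to MongoDB; `simulate_insert_many` is a hypothetical function, and the `ordered` flag mirrors the option of the same name on insertMany.

```python
def simulate_insert_many(existing_ids, docs, ordered=True):
    """Toy in-memory model of insertMany duplicate handling (not MongoDB itself)."""
    stored = set(existing_ids)
    inserted, failed = [], []
    for doc in docs:
        if doc["_id"] in stored:
            failed.append(doc["_id"])
            if ordered:
                break      # ordered mode stops at the first error
            continue       # unordered mode skips the duplicate and keeps going
        stored.add(doc["_id"])
        inserted.append(doc["_id"])
    return inserted, failed

batch = [{"_id": "299"}, {"_id": "300"}, {"_id": "301"}]
print(simulate_insert_many({"300"}, batch, ordered=True))   # (['299'], ['300'])
print(simulate_insert_many({"300"}, batch, ordered=False))  # (['299', '301'], ['300'])
```

Against a real server, passing `{ordered: false}` to insertMany in the shell (or `ordered=False` to PyMongo's `insert_many`) lets the non-duplicate records through, but the duplicates are still reported as errors rather than updated.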
To avoid losing the records that follow a duplicate, the batch can be written as upserts from a Python script:

from pymongo import UpdateOne
import mongo_util  # connection helper module shown at the end of this section
collection_test = mongo_util.get_collection('test', 'test')
list_result = [{
    "_id" : "299",
    "city_id" : "299",
    "city_name" : "阿坝1",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
},{
    "_id" : "301",
    "city_id" : "301",
    "city_name" : "阿坝3",
    "lat" : 31.9055115772665,
    "lng" : 102.23141546175
}]
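# Build one upsert per document: update when the _id exists, insert otherwise.
write_list = []
for res in list_result:
    write_list.append(UpdateOne({'_id': res['_id']}, {'$set': res}, upsert=True))
# ordered=False lets the remaining operations run even if one of them fails.
collection_test.bulk_write(write_list, ordered=False)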

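As shown above, when the _id of a document in the batch already exists in the collection, the document is updated; otherwise it is inserted.

The mongo_util module used above is a small connection helper:

"""
    mongo_util: MongoDB connection helper
"""
from pymongo import MongoClient
connect_url = 'mongodb://root:123456@127.0.0.1:27017/admin'
# connect=False defers the actual connection until the first operation.
client = MongoClient(connect_url, connect=False)

def get_collection(dbname, collection_name):
    """
    Get a collection handle.
    :param dbname: database name
    :param collection_name: collection name
    :return: the pymongo Collection object
    """
    db = client[dbname]
    return db[collection_name]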