Manage Data Using MongoDB

#Node #Crawler

The original Chinese article is written by ninthakeey. It has been translated and remixed by Datumorphism

In most cases, databases makes the management of data quite convenient. In this article, we would scrape data using the code we discussed before but write data into MongoDB.

For installation of MongoDB, please refer to the official documentation.

The Code

To write data to MongoDB using Node.js, we choose the package mongojs, which provides almost exactly the standard MongoDB syntax.

To install mongojs,

npm i mongojs --save

Here is a module that can write data to MongoDB. We create a file named dao.js and copy/paste the following code into it.

// use mongojs
const mongojs = require('mongojs')
// connect to the database 'simple_spider' in MongoDB and use collection 'test'
const localdb = mongojs('simple_spider', ['test'])
// a function that saves data to MongoDB
const saveData = (data,cb) => {, (err, res)=>{
	 	cb && cb(err, res)
// a function that prints out the data to console
const printAllData = (cb) => {
	localdb.test.find({},(err, docs)=>{
		cb && cb(err, docs)
// close connection to MongoDB in case of memory leak
const closeMongo = () => localdb.close()
// expose the functions in this module to the program
module.exports = {
() =>

In Node.js, arrow function is keystroke saver for function expressions. It comes without this, arguments, super, etc.

In the above code, the expression

const printAllData = (cb) => {
  // ...

defines a function named printAllData with argument cb and does something as indicated inside the curly bracket.

Three functions are provided in this module, saveData, printAllData and closeMongo. We will grab our previous code and call the functions from this module. We would like to modify the index.js file to make it look like the following.

const superagent = require('superagent')
const fs = require('fs')
const dao = require('./dao')
	.query({ aid:26763233 })
	.then(res => {
    // obtain the data to be saved
		const data =
		dao.saveData(data, (err,res)=>{
			console.log('saved data')
			// save data to MongoDB and close connection
			dao.printAllData( () => dao.closeMongo() )
	.catch(err => console.error(err))

Run the code with the command

node index

and we obtaint he following output:

saved data
[ { _id: 5b41d634f3df89032c834d5b,
    aid: 26186448,
    view: 619751,
    danmaku: 12053,
    reply: 3500,
    favorite: 14961,
    coin: 40699,
    share: 1676,
    like: 20751,
    now_rank: 0,
    his_rank: 1,
    no_reprint: 1,
    copyright: 1 } ]

With this print out, we would confirm that our data is being stored in our MongoDB.

Published: by ;

Current Ref:

  • wiki/nodecrawler/