mongo-replica-set

Making-of: docker-based MongoDB replica-set migration from single instance

https://github.com/xverges/mongo-replica-set

Initial situation

Vagrant environment to test

Plan

A Tangent

I was looking for a simple way to set up and activate a virtualenv associated with the project, and went jumping from cool project to cool project: from autoenv to direnv, finally betting on Pipenv: Python Development Workflow for Humans. Thus, the project dependencies are tracked in the Pipfile.

Doing. First (half) successful attempt.

Get our python dependencies and the environment variables defined in .env.

$ pipenv shell
(mongo-replica-set-qvtM3FSm)$

Start the first and second vagrant boxes and their docker-compose (vagrant destroy + vagrant up). This requires installing docker inside the guests and getting the mongodb docker image, so it takes a while (about 15 minutes on my home network).

(mongo-replica-set-qvtM3FSm)$ ./scripts/01-start-standalone.sh

01-start-standalone.sh

Reset the boxes. Trial and error required returning to this step very often.

(mongo-replica-set-qvtM3FSm)$ ./scripts/reset-standalone.sh all

reset-standalone.sh

Create dbs, collections and documents in the first and second instances

(mongo-replica-set-qvtM3FSm)$ ./scripts/02-feed-standalone.py
(mongo-replica-set-qvtM3FSm)$ ./scripts/read-standalone.py

02-feed-standalone.py read-standalone.py

Consolidate all the info on the first.

(mongo-replica-set-qvtM3FSm)$ ./scripts/backup.sh
(mongo-replica-set-qvtM3FSm)$ ./scripts/restore.sh second 192.168.100.10
(mongo-replica-set-qvtM3FSm)$ ./scripts/reset-standalone.sh second
(mongo-replica-set-qvtM3FSm)$ ./scripts/read-standalone.py

backup.sh restore.sh reset-standalone.sh read-standalone.py

Restart the instances, now with the --replSet param set and allowing hosts other than localhost to connect to mongodb. I allowed everything with the param --bind_ip 0.0.0.0.

Note that, when I specified the --replSet param with an empty data directory, the scripts in /docker-entrypoint-initdb.d were not executed.

(mongo-replica-set-qvtM3FSm)$ ./scripts/03-stop-standalone.sh
(mongo-replica-set-qvtM3FSm)$ ./scripts/04-start-with-repl-param.sh

03-stop-standalone.sh 04-start-with-repl-param.sh
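As a rough illustration of the restart described above, the sketch below builds the mongod argument list with the two params mentioned (--replSet and --bind_ip). The function name mongod_args and the replica set name "rs0" are assumptions made for this example; the actual values live in the project's docker-compose files and scripts.

```python
def mongod_args(repl_set_name, bind_ip="0.0.0.0"):
    """Build the mongod argument list for a replica-set member.

    repl_set_name is whatever name the set will be initiated with later;
    "rs0" below is a placeholder, not the name used by this project.
    """
    if not repl_set_name:
        raise ValueError("a replica set name is required")
    # --bind_ip 0.0.0.0 accepts connections on every interface,
    # which is convenient for a test environment only.
    return ["mongod", "--replSet", repl_set_name, "--bind_ip", bind_ip]


print(" ".join(mongod_args("rs0")))
```

Binding to 0.0.0.0 is the "allow everything" choice made here for testing; a narrower address list would be the safer option elsewhere.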

The instances are not operational now. If we try to read them…

(mongo-replica-set-qvtM3FSm) bash-3.2$ ./scripts/read-standalone.py
----FIRST----
node is not in primary or recovering state
----SECOND----
{'local': {'not_replicated': ()}}

…and the reported status confirms why:

(mongo-replica-set-qvtM3FSm) bash-3.2$ ./scripts/get-replicaset-status.sh
Working with first... Mapped to the host port 27110
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.4
{
	"info" : "run rs.initiate(...) if not yet done for the set",
	"ok" : 0,
	"errmsg" : "no replset config has been received",
	"code" : 94,
	"codeName" : "NotYetInitialized"
}
Working with second... Mapped to the host port 27111
MongoDB shell version v3.4.4
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.4.4
{
	"info" : "run rs.initiate(...) if not yet done for the set",
	"ok" : 0,
	"errmsg" : "no replset config has been received",
	"code" : 94,
	"codeName" : "NotYetInitialized"
}

get-replicaset-status.sh
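The status replies above can be told apart programmatically. The following is a minimal sketch, not part of the project's scripts: it takes a reply shaped like the shell output shown above (as a plain dict) and reports whether the set still needs rs.initiate(). The function name describe_status is an assumption for this example.

```python
NOT_YET_INITIALIZED = 94  # "code" value from the status output above


def describe_status(reply):
    """Summarize a replica set status reply given as a plain dict."""
    if reply.get("ok") == 1:
        return "initialized"
    if reply.get("code") == NOT_YET_INITIALIZED:
        return "needs rs.initiate()"
    return "error: " + reply.get("errmsg", "unknown")


# The reply reported by both instances before initialization.
reply = {
    "info": "run rs.initiate(...) if not yet done for the set",
    "ok": 0,
    "errmsg": "no replset config has been received",
    "code": 94,
    "codeName": "NotYetInitialized",
}
print(describe_status(reply))
```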

All the previous commands, in a single line to make trial and error faster:

(mongo-replica-set-qvtM3FSm)$ ./scripts/reset-standalone.sh all && ./scripts/02-feed-standalone.py && ./scripts/backup.sh && ./scripts/restore.sh second 192.168.100.10 && ./scripts/reset-standalone.sh second && ./scripts/03-stop-standalone.sh && ./scripts/04-start-with-repl-param.sh

Initialize the replica set. This is done in the replicaset-init.js and replicaset-add-additional.js scripts.

(mongo-replica-set-qvtM3FSm)$ ./scripts/05-init-replicaset.sh
(mongo-replica-set-qvtM3FSm)$ ./scripts/read-standalone.py

05-init-replicaset.sh
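For reference, a replica set is initiated with a configuration document that names the set and lists its members. The sketch below builds such a document in Python; it is an illustration of the document's shape, not a copy of the project's .js scripts. The function name replica_set_config, the set name "rs0", and every address except 192.168.100.10 are assumptions.

```python
def replica_set_config(name, data_hosts, arbiter_host=None):
    """Build a replica set configuration document.

    data_hosts: "host:port" strings for data-bearing members.
    arbiter_host: optional "host:port" of a voting-only arbiter.
    """
    members = [{"_id": i, "host": host} for i, host in enumerate(data_hosts)]
    if arbiter_host is not None:
        members.append(
            {"_id": len(members), "host": arbiter_host, "arbiterOnly": True}
        )
    return {"_id": name, "members": members}


config = replica_set_config(
    "rs0",  # hypothetical set name
    ["192.168.100.10:27017", "192.168.100.11:27017"],  # second address assumed
    arbiter_host="192.168.100.12:27017",  # assumed arbiter address
)
print(config)
```

The "_id" of the document has to match the name passed to --replSet when the instances were started.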

The arbiter has not been set up, but we can access both instances using the previous connection params, which do not specify anything related to the replica set.

Networking issues

When trying to use the replicaset param when creating MongoClient, I learned that my OSX host cannot reach my VirtualBox guests, nor can my guests reach the host-only address where I expected my host to be. Lots of googling, but nothing helped.

For the record:

$ VBoxManage list hostonlyifs

...

Name:            vboxnet2
GUID:            786f6276-656e-4274-8000-0a0027000002
DHCP:            Disabled
IPAddress:       192.168.100.1
NetworkMask:     255.255.255.0
IPV6Address:
IPV6NetworkMaskPrefixLength: 0
HardwareAddress: 0a:00:27:00:00:02
MediumType:      Ethernet
Wireless:        No
Status:          Up
VBoxNetworkName: HostInterfaceNetworking-vboxnet2
$ ifconfig
...
vboxnet2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
  ether 0a:00:27:00:00:02
  inet 192.168.100.1 netmask 0xffffff00 broadcast 192.168.100.255
...
$ VBoxManage showvminfo mongo-replica-set_arbitrer_1537298209734_479
...
Guest OS:        Red Hat (64-bit)
...
NIC 1:           MAC: 08002737F846, Attachment: NAT, Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: deny, Bandwidth group: none
NIC 1 Settings:  MTU: 0, Socket (send: 64, receive: 64), TCP Window (send:64, receive: 64)
NIC 1 Rule(0):   name = ssh, protocol = tcp, host ip = 127.0.0.1, host port = 2201, guest ip = , guest port = 22
NIC 1 Rule(1):   name = tcp27112, protocol = tcp, host ip = , host port = 27112, guest ip = , guest port = 27017
NIC 2:           MAC: 080027EA7320, Attachment: Host-only Interface 'vboxnet2', Cable connected: on, Trace: off (file: none), Type: 82540EM, Reported speed: 0 Mbps, Boot priority: 0, Promisc Policy: allow-all, Bandwidth group: none
...
Guest:

Configured memory balloon size:      0 MB
OS type:                             Linux26_64
Additions run level:                 2
Additions version:                   5.1.26 r117224

Guest Facilities:

Facility "VirtualBox Base Driver": active/running (last update: 2018/09/18 19:17:03 UTC)
Facility "VirtualBox System Service": active/running (last update: 2018/09/18 19:17:06 UTC)
Facility "Seamless Mode": not active (last update: 2018/09/18 19:17:03 UTC)
Facility "Graphics Mode": not active (last update: 2018/09/18 19:17:03 UTC)
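A quick way to check the kind of reachability problem described in this section is to attempt a plain TCP connection to the address and port in question. This is a minimal sketch using only the standard library; the function name can_reach is an assumption, and it says nothing about why a connection fails.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

For example, can_reach("192.168.100.10", 27017) from the host, or can_reach("192.168.100.1", 27110) from a guest, would show whether the host-only network is usable in each direction.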

Connecting to the replica set from a proper network host

The vagrant boxes see each other, and there we can connect to mongo specifying that we are connecting to a replica set:

MongoClient(url_to_local, replicaset=replicaset_name, read_preference=ReadPreference.NEAREST)

We can read and write using this client, from both the box that has the primary and the box that has the secondary.
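The same connection settings can also be carried in the connection string instead of keyword arguments. The sketch below assembles such a URI with the standard replicaSet and readPreference options; the function name replica_set_uri, the set name "rs0" and the second address are assumptions for illustration.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="nearest"):
    """Build a mongodb:// URI that lists several members of one replica set."""
    query = urlencode(
        {"replicaSet": replica_set, "readPreference": read_preference}
    )
    return "mongodb://{}/?{}".format(",".join(hosts), query)


uri = replica_set_uri(
    ["192.168.100.10:27017", "192.168.100.11:27017"],  # second address assumed
    "rs0",  # hypothetical set name
)
print(uri)
```

Listing more than one member lets the driver discover the set even when one of the listed hosts is down.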

Working with proper credentials and with the local database

So far the tests have been run using mongo root’s credentials and regular databases. We now need to verify that we can work with less privileged credentials and with the local databases. We will

The tests show that

Wrap up