Monday, June 3, 2013

Playing with MariaDB (MySQL) replication

MariaDB is an open-source replacement for MySQL. MariaDB keeps a so-called binary log of transactions, which keeps track of all updates to the database. It is used for database restoration and replication, but in this tutorial we will focus on replication only. Keeping a binary log also proves useful if you need the ability to do point in time recovery from your latest backup, but we will not see that.

Note: I am using Arch Linux and following the official MariaDB documentation for replication.

Master configuration

master # pacman -S mariadb

Optional - for mytop to work:
master # pacman -S perl-dbd-mysql perl-term-readkey

Optional - auto completion of table, field names, etc. in the client
master # sed -ri -e 's/^no-auto-rehash/auto-rehash/' /etc/mysql/my.cnf

master # systemctl start mysqld

I recommend answering yes to all questions and setting a MySQL root password:
master # mysql_secure_installation

master # systemctl restart mysqld

Try connection:

master # mysql -p


Here is courses.mysqldump, a very simple relational database:

DROP DATABASE IF EXISTS courses;
CREATE DATABASE courses;
USE courses;
DROP TABLE IF EXISTS offering, course, instructor;

CREATE TABLE course (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  title varchar(255) NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE instructor (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  name varchar(255) NOT NULL,
  PRIMARY KEY (id)
);

CREATE TABLE offering (
  course_id int(10) unsigned NOT NULL,
  instructor_id int(10) unsigned NOT NULL,
  PRIMARY KEY (course_id, instructor_id),
  FOREIGN KEY (course_id) REFERENCES course(id)
    ON DELETE CASCADE ON UPDATE CASCADE,
  FOREIGN KEY (instructor_id) REFERENCES instructor(id)
    ON DELETE CASCADE ON UPDATE CASCADE
);


Then insert some sample data:


master # mysql -p courses
MariaDB [courses]> INSERT INTO course VALUES (1,'Basic Linux Programming');
MariaDB [courses]> INSERT INTO instructor VALUES (1, 'Antonio Bonifati');
MariaDB [courses]> INSERT INTO offering VALUES (1,1);

The binary log is active by default:

master # cat /var/lib/mysql/mysql-bin.index
./mysql-bin.000001
./mysql-bin.000002
./mysql-bin.000003
./mysql-bin.000004
master # file /var/lib/mysql/mysql-bin.000001
/var/lib/mysql/mysql-bin.000001: MySQL replication log

It's default size is 1 GB (max_binlog_size):

master # mysqld --verbose --help 2>/dev/null | grep ^max-binlog-size
max-binlog-size 1073741824

As you can see in /etc/mysql/my.cnf, by default MariaDB is configured as a master and mixed binary-logging format is used:


# Replication Master Server (default)
# binary logging is required for replication
log-bin=mysql-bin

# binary logging format - mixed recommended
binlog_format=mixed


On the master, we also need to create an account that will be used by the slave to connect and start replicating. We can reuse a normal user account, and just grant the REPLICATION SLAVE permission, but it is more secure to have a dedicated account (e.g. repl) with this privilege only:

master # mysql -p -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'SLAVE_IP' IDENTIFIED BY 'REPL_PWD'; FLUSH PRIVILEGES"


Remember to replace SLAVE_IP with the slave IP address before issuing the above command.

Each master or slave in the same replicating group must have a unique server_id. E.g. I will leave the master server_id at its default value (1) and give a different value to the slave (e.g. 2):

master # mysqld --verbose --help 2>/dev/null | grep ^server-id

server-id 1

Slave configuration

Since I do not have another machine for tests, I have installed another copy of Arch Linux in Virtualbox to serve as a slave. The host is the master server itself (my laptop). On the slave, I installed and secured MariaDB exactly as done before for the master.

Then assign the slave a 32-bit number different from the master (e.g. 2). You do this by editing /etc/mysql/my.cnf, commenting the line

server-id = 1

and uncommenting a similar one a few lines below:

# server-id = 2

Also, since this is a slave, disable the binary log by commenting the following lines:

log-bin=mysql-bin
binlog_format=mixed

Actually you can leave this lines (or luncomment another line log-bin=mysql-bin just below in the section related to slave configuration), if you want to have a chained replication setup. That is this slave will write to a binary log any data modifications that are received from the replication master, in order to act as a master to one or more other slaves.

Another reason to enable binary logging on the slave is to enable incremental backups using the slave.

This is not our case and to save disk space and speed up the slave replication, we will not write a binary log there. I have also deleted the existent binary logs:

slave # rm /var/lib/mysql/mysql-bin.*

You can also purge binary logs before a certain date with a query.


After you save edits to /etc/mysql/my.cnf, you need to restart mysql:


slave # systemctl restart mysqld

We now need to copy the data from master to slave. Let's pretend the master server is busy and there are data manipulation queries like INSERT and UPDATE going on, how do we make sure the data copy is consistent (not half new and half old)?

The solution for InnoDB (the default storage engine in MariaDB) is to use the --single-transaction option of mysqldump. It works by starting a transaction statement to the server before dumping data. Because of the isolation property of transactions, the dump will always be consistent (unless your application contains ALTER, CREATE or similar data definition instructions, which is unlikely). If your tables are big add a --quick option to retrieve one table row at a time rather than retrieving the whole row set and buffering it in memory before writing it out.

Temporary allow master to connect to slave as root:
slave # mysql -p -e "GRANT ALL PRIVILEGES ON *.* TO 'root'@'MASTER_IP' IDENTIFIED BY 'TEMP_PWD'; FLUSH PRIVILEGES"

master # mysqldump -p --master-data --single-transaction -B courses | mysql -h SLAVE_IP -p'TEMP_PWD'
Drop privileges and root access from the master to the slave:
slave # mysql -p -e "DROP USER 'root'@'MASTER_IP'"



If you had more than one database to copy, you would simply add other database names after "courses". You can also run the data copy command the other way around, that is on the slave. Left as an exercise for the reader.


The --master-data options causes the dump to include a query like:

--
-- Position to start replication or point-in-time recovery from
--

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000004', MASTER_LOG_POS=5815;

This tells the slave the position in the binary log of the master server to start replication from.

In general, for InnoDB tables, mysqldump's --single-transaction option is the way of making an online backup, e.g. locally for all databases:

# mysqldump -A --single-transaction all_databases.sql

Now issue this query on the slave, to tell it what the master is and what account to use for replication:

slave # mysql -p -e "CHANGE MASTER TO
  MASTER_HOST='MASTER_IP',
  MASTER_USER='repl',
  MASTER_PASSWORD='REPL_PWD'"


You can optionally override the default value for MASTER_CONNECT_RETRY (86400). This is the number of times that the slave tries to connect to the master before giving up. A value of 0 means “infinite”; the slave attempts to connect forever.

Replication has not started yet. You have to issue:

slave # mysql -p -e "START SLAVE"

You can now try to add a record in a table of the courses database in the master and you will see the same appears on the slave.


On the slave use:

slave # mysql -p -e "SHOW SLAVE STATUS \G"
Enter password:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.14.51
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000006
Read_Master_Log_Pos: 245
Relay_Log_File: mysqld-relay-bin.000006
Relay_Log_Pos: 529
Relay_Master_Log_File: mysql-bin.000006
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 245
Relay_Log_Space: 1108
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1

to see the state of replication, while on the master:


master # mysql -p -e "SHOW MASTER STATUS \G"
Enter password:
*************************** 1. row ***************************
File: mysql-bin.000006
Position: 245
Binlog_Do_DB:
Binlog_Ignore_DB:
Note: during replication, a slave server creates several logs that hold the binary log events relayed from the master to the slave, and to record information about the current status and location within the relay log. There are three types of logs used in the process by the slave, one relay log and two status logs: 

The relay log (e.g. mysqld-relay-bin.000006 above for the slave) consists of the events read from the binary log of the master and written by the slave I/O thread. Events in the relay log are executed on the slave as part of the SQL thread.

Note you can convert and display a binary log file (including a relay log, since they have the same structure as a regular binary log) in text mode using mysqlbinlog, e.g.:

slave # mysqlbinlog /var/lib/mysql/mysqld-relay-bin.000006

Anyway, only SQL statements logged are readable in this output (if any). Row changes are not easy to interpret.

The master info log (master.info) is plain text and contains status and current configuration information for the slave's connection to the master. This log holds information on the master host name, login credentials, and coordinates indicating how far the slave has read from the master's binary log. Since it contains the password in clear, this file is only readable by the mysql user and any user in the mysql group (by default none) and of course by root:

slave # cat /var/lib/mysql/master.info
18
mysql-bin.000006
245
192.168.14.51
repl
REPL_PWD
3306
60
0





0
1800.000

0


0





Values in master.info can be changed using an SQL query: CHANGE MASTER TO. It allows to set the master host to use and the position in the binary log to start replication from (e.g. the one you get from a SHOW MASTER STATUS query on the master):

slave # mysql -p -e "CHANGE MASTER TO
    MASTER_HOST=MASTER_IP,
    MASTER_USER='repl',
    MASTER_PASSWORD=REPL_PWD,
    MASTER_LOG_FILE='mysql-bin.000004',
    MASTER_LOG_POS=2012"

You better not set these options in my.cnf like:


master-host = MASTER_IP
master-user =  repl
master-password = REPL_PWD

because that is currently not supported (it will probably be in the future, I hope).


The relay log info log is also simple text and holds status information about the execution point within the slave's relay log.


slave # cat /var/lib/mysql/relay-log.info
./mysqld-relay-bin.000006
529
mysql-bin.000006
245

04
With that background, here is an explanation of some of the fields in the output of SHOW SLAVE STATUS.

Field Relay_Master_Log_File (mysql-bin.000006) is the name of the binlog on the master containing the last SQL statement successfully executed on the slave. It is not the name of a relay log on the slave, but a binary log on the master.

master # mysqlbinlog /var/lib/mysql/mysql-bin.000006

Exec_Master_Log_Pos (245) is the position in the Relay_Master_Log_File that the slave SQL thread has executed up to. So in your example, the slave db has executed all statements up until binlog mysql-bin.000006 pos 245 on the master db. You see this value is the same you see in SHOW MASTER STATUS in the Position column.


The tuple (Relay_Master_Log_File, Exec_Master_Log_Pos) expresses the coordinates in the master binary log indicating how far the slave SQL thread has executed events received from that log.


What is the difference between Relay_Master_Log_File and Master_Log_File? (Master_Log_file, Read_Master_Log_Pos): Coordinates in the master binary log indicating how far the slave I/O thread has read events from that log.

Finally (Relay_Log_File, Relay_Log_Pos) is the coordinates in the slave relay log indicating how far the slave SQL thread has executed the relay log. These correspond to the preceding coordinates, but are expressed in slave relay log coordinates rather than master binary log coordinates.



until_log_pos is really only used if you start your slave with the query "START SLAVE UNTIL master_log_pos = integer". That syntax will replicate up to that position (it will be the exec_master_log_pos) and then stop. You would normally only do this if you wanted to replicate to a specific point, but no further (like if the next statement is an accidental table drop or something). The value of until_log_pos is 0 when it is not specified, so in our case means that replication will just keep moving forward.

You can now try to shut down MySQL on the slave:

slave # systemctl stop mysqld

Insert a new course on the master:

master # mysql -p courses
MariaDB [courses]> INSERT INTO course VALUES (3, "C Programming");
MariaDB [courses]> INSERT INTO offering VALUES (3, 1);

Then start the slave again:

slave # systemctl start mysqld

And check you have this new data:

slave # mysql -p courses
MariaDB [courses]> SELECT * FROM course WHERE id=3;
+----+---------------+
| id | title         |
+----+---------------+
| 3  | C Programming |
+----+---------------+


MariaDB [courses]> SELECT * FROM offering;
+-----------+---------------+
| course_id | instructor_id |
+-----------+---------------+
|         1 |             1 |
|         3 |             1 |
+-----------+---------------+


So replication resumes automatically. You usually won't disconnect a slave on purpose, but there is a case where this makes sense: if you want to back up the database consistently. In this case you disconnect the slave and take the backup from it. You can simply copy or rsync all files in /var/lib/mysql/* recursively. This ensures all database will be backed up at the same instant in time.


What happens if you write to the slave, which is supposed to be used as read-only? Go on and make a query like this:

slave # mysql -p -e "INSERT INTO instructor VALUES(2,'Duffy Duck')" courses

You see there is nothing to prevent you from writing to the slave if you connect with a user that has such privilege:

slave # mysql -p -e "SELECT * FROM instructor" courses
Enter password:
+----+------------------+
| id | name             |
+----+------------------+
|  1 | Antonio Bonifati |
|  2 | Duffy Duck       |
+----+------------------+


Normally you would prevent writes to the slave by not having users with CREATE, UPDATE, etc. permission, but if you really want to be sure that no user other than those with the SUPER privileges and the replication can write, set the read_only global variable:


slave # mysqld --verbose --help 2>/dev/null | grep ^read-only
read-only FALSE


You can set it permanently in /etc/mysql/my.cnf (under the [mysqld] section):

read_only = 1

read_only is a dynamic variable, meaning a change does not require a full restart of the server with systemctl restart mysqld. It's value can be changed dynamically while the server is running, so to make effective the change to my.cnf, you either restart or just do:

slave # mysql -p -e 'FLUSH TABLES WITH READ LOCK; SET GLOBAL read_only = ON; SHOW VARIABLES LIKE "read_only"'

FLUSH TABLES WITH READ LOCK locks all tables to make sure that nobody can write to databases (e.g. by doing an INSERT or UPDATE). Once the lock is acquired, we can safely set the server as read-only for clients. The lock will be automatically releases at the end of the session, that is when the mysql command returns.

After that try:

slave # mysql -p -e "INSERT INTO instructor VALUES(3,'Wiley E. Coyote')" courses

you see that it succeeds, because you are root. But if you create another user with write access and no SUPER privilege, you will get an error:

slave # mysql -p -e 'GRANT ALL PRIVILEGES ON courses.* TO ant@localhost IDENTIFIED BY "ant_pwd"'



slave # mysql -uant -p -e "INSERT INTO instructor VALUES(4,'Porky Pig')" courses
ERROR 1290 (HY000) at line 1: The MariaDB server is running with the --read-only option so it cannot execute this statement

So it is your responsibility to make sure the slave databases are consistent. MySQL does not care. It provides replication only, not cloning.

2 comments:

Anonymous said...

Hello Antonio, great write-up. I followed it almost completely and I only had one issue: the default server_id is 0, not 1, so before I changed that on the master the slave was throwing an error:

Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Misconfigured master - server id was not set'

Thanks!

Farmboy said...

Hi, I am running mariadb-5.5.36 and the default server_id is still 1, at least in Arch Linux. Anyway, I am no more very much into computers, since I decided to become a farmer. If you think there is something wrong in the article let me know exactly what to correct and I will be do it gladly.