User:DanielRenfro/MySQL
This is a page about mysql for the Hu laboratory. This page used to be specifically about replication, but replication has been moved into a subsection and other sections added.
Usage
editStarting and Stopping
editI think there are about 1.2e5 different ways to start and stop mysql, some of which are:
- using mysqld_safe script
- Can be run to start a
mysqld
daemon. mysqld_safe adds some safety features such as restarting the server when an error occurs and logging runtime information to an error log file.
- Can be run to start a
- using mysqldadmin
- not used to start
mysqld
daemons, but can be used to do other things like flush-tables, start/stop replication, shutdown. always good in a pinch.
- not used to start
- using launchctl (for macs)
- using the launchctl wrapper:
sudo /opt/local/etc/LaunchDaemons/org.macports.mysql5/mysql5.wrapper [start|stop]
- --or--
- using launchctl directly:
sudo launchctl load -w /opt/local/etc/LaunchDaemons/org.macports.mysql5/org.macports.mysql5.plist
- this method is typically used by the system and not used by the system-admin
- using the launchctl wrapper:
- using mysql.server script (for linux)
- similar to the aforementioned launchctl method, but for systems that use V-style run directories
- using mysqld_multi[5]
- for starting one or more daemons
General Documentation
editThese are things that are good to know, but didn't fit in other sections.
Programs/Utilities
editmysql
editmysqld_safe
editmysqld_multi
edit- This program is a
perl
script for starting/stopping multiple daemons at the same time. - It logs to
/opt/local/share/mysql5/mysqld_multi.log
(for macports) - mysqld_multi at dev.mysql.com
mysql.server & mysqld_multi.server
edit- Scripts that are typically put into /etc/init.d (like on Linux systems) so that the MySQL server(s) are started automatically upon system start.
- For Macintosh computers, I don't think these are relevant, as Macintosh has migrated to the
launchd
system of starting/stopping processes when the system boots. - See
/Library/LaunchDaemons/org.macports.mysql5.plist
, and/or/opt/local/etc/LaunchDaemons/org.macports.mysql5/org.macports.mysql5.plist
(the first might be a soft-link to the latter)
- For Macintosh computers, I don't think these are relevant, as Macintosh has migrated to the
my_print_defaults
editmysql_config
editMacPorts
editThese might be helpful one day:
- https://trac.macports.org/wiki/howto/MAMP#mysql
- https://trac.macports.org/wiki/howto/MySQLWorkbench
- Snow Leopard: http://stackoverflow.com/questions/4798464/installing-mysql-on-snow-leopard-using-macports
- http://hennessynet.com/2008/01/15/installing-macports-and-mysql-on-leopard.html
Paths
editMacports installs MySQL in the following places:
/opt/local/etc/my.cnf /opt/local/bin/mysql*5 /opt/local/var/db/mysql5/ /opt/local/lib/mysql5/ ... probably lots of other places
Gotchas
editThe difference between % and localhost
editWhen setting up users, you can choose between three options for the host:
- %
- This means any user connecting over TCP/IP.
- localhost
- This means any user connecing via the socket - which means they must be on this computer.
- This is totally different than connecting via 127.0.0.1 (which would use the permissions set for user@%)
string - hostname
- A specific host, over TCP/IP
When does it read my.cnf
edit
It seems like sometimes the mysql binaries read the configuration directives that are stored in [/opt/local]/etc/my.cnf
and sometimes they don't. I haven't figured out why this is; maybe just some programs/scripts know about my.cnf.
Locking
edittypes
edit- shared locks / read locks
- many clients can read a shared-locked resource
- exclusive locks / write locks
- only one client can access/change this
Storage Engines
editAttribute | MyISAM | InnoDB |
---|---|---|
transactions | no | yes |
lock-granualrity | table | row |
storage | split files (3 of them) | tablespace (possibly one file) |
referential integrity | no | yes |
primary key with data | no | yes |
MySQL caches data records | no | yes |
full-text indices | yes | no |
select count(*) | fast (keeps a count on-hand) | slow (must count each time) |
Replication
editIt is important to understand how MySQL replication works. There is a lot of documentation on the web, but here are a few notes I have made through my process of learning:
This process is as simple as:
- configure the master
- configure the slave
- start both servers
- connect the slave to the master
Russell Dyer, author of MySQL in a Nutshell explains on this website the rational for having a simple replication setup.
One of the difficulties with a large and active MySQL database is making clean backups without having to bring the server down. Otherwise, a backup may slow down the system and there may be inconsistency with data, since related tables may be changed while another is being backed up. Taking the server down will ensure consistency of data, but it means interruption of service to users. Sometimes this is necessary and unavoidable, but daily server outages for backing up data may be unacceptable. A simple alternative method to ensure reliable backups without having to shut down the server daily is to set up replication for MySQL.
For the Hu-laboratory we're going to set up a single master and slave pair on the same server. This means we're dealing with two issues:
- replication between master & slave
- running concurrent
mysqld
processes at the same time
Chapter 1 - Why Replication?
editWhy would you want to set up replication?
- High availability
- In case of catastrophe or system-failure, you can (automagically) fall back to a slave.
- Load Balancing
- For large systems you can spread the load out over several machines/daemons, thereby increasing efficiency and speed.
- Off-site processing
- Don't disturb the master server when doing database-intensive processing.
For us it is a matter of availability. We want to have one database (the master) that is always available, instead of having a long period of latency for backups. With this system we can backup the slave at any time without affecting what goes on with the production sites.
Chapter 2 - Replication Basics
editIn this tutorial we will be using a single master server and a single slave server. The master is the control-point of replication and all changes should happen on the master and get "pushed" to the slave. (In actuality, the slave will ask for changes and execute them.) This is not a synchronous mechanism - meaning the changes to the slave might not happen in the same order that they took place on the master. In theory this sounds bad, but in practice it turns out to be alright. Synchronous replication is slower and has higher fidelity, but is harder to setup. For our purposes using the more simple method is sufficient.
Essentially the replication process consists of these steps:
- an SQL query is sent to the MASTER. MySQL records this query in the bin.log file along with the log position number
- the SLAVE database checks (through an IO process) the bin.log file for any changes
- when a change is encountered the SLAVE copies the SQL statement to it's relay.log file, along with the position number and records this number in the master.info file
- the SLAVE executes the SQL andqueries the MASTER
- the SLAVE checks the results of both queries for consistency and issues any errors if necessary
Chapter 3 - The bin-log
file(s)
edit
The binary log is a set of files that keep track of what changes have been made on the master server so that the same changes can be made on the slave. Here's a brief section from MySQL High Availability (available through the library as an e-book):
The previous chapter included a very brief introduction to the binary log. In this chapter, we will fill in more details and give a more thorough description of the binary log structure, the replication event format, and how to use the mysqlbinlog tool to investigate and work with the contents of binary logs.
The binary log records changes made to the database so that the same changes can be made on any of the slaves as well. Since the binary log normally keeps a record of all changes, you can also use it for auditing purposes to see what happened in the database, and for PITR by playing back the binary log to a server, repeating changes that were recorded in the binary log.
The binary log contains only statements that could change the database. Note that statements that do not change the database but that could potentially change the database are logged. The most notable statements are those that optionally make a change, such as DROP TABLE IF EXISTS or CREATE TABLE IF NOT EXISTS, along with statements such as DELETE and UPDATE that have WHERE conditions that don’t happen to match any rows.
SELECT statements are not normally logged, since they do not make any changes to any database. There are, however, exceptions.
Transactions on a server are not normally executed in sequence, one after the other, but are rather interleaved and executed in parallel. To ensure that two transactions do not conflict and generate an inconsistent result, the server ensures the transaction execution is serializable, meaning the transactions are executed in such a way that the execution yields the same result as if they were executed in a serial order—that is, in a fixed order, one transaction after another.
The binary log records each transaction in the order that the commit took place on the master. Although transactions may be interleaved on the master, each appears as an uninterrupted sequence in the binary log, the order determined by the time of the commit.
So, we'll need to configure the master/slave pair to use the bin-log.
Chapter 4 - Configuration
editReplication Settings
editOn the master we will need a few options configured in the my.cnf
file...usually found at /etc/my.cnf
. The necessary options needed are:
log-bin server-id
and optionally:
log-bin-index binlog-do-db binlog-ignore-db
The explanation for each parameter is as follows:
- log-bin
- the base name for all the files created by the binary log
- log-bin-index
- gives the name of the binary log index file, which keeps a list of all bin-log files
- defaults to whatever the log-bin basename is
- server-id
- each server must have a unique id for identification purposes
- binlog-do-db, binlog-ignore-db
- a list of databases to replicate or skip, respectively.
On the slave all we really need is:
server_id
and optionally:
replicate-do-db replicate-ignore-db replicate-do-table replicate-ignore-table read-only log-slave-updates skip-slave-start ...
I think these options are pretty self-explanatory. If you want to read more, see the Replication and Binary Logging Options and Variables documentation section on the MySQL website.
concurrent mysqld
processes
edit
At this point let's take a moment to focus on setting up the two mysqld
servers without worrying about replication. Running Multiple MySQL Instances on One Machine has some good information. At this point we have a few options:
Compiling different binaries for each daemon we want to run with different settings.(seems like overkill, and I don't want to have to keep up with many different binaries.)- Instantiating the same binary many times with different settings:
set on the command line- we would need a shell script to instantiate both daemons with different options because doing it manually is too much work and I forget things too easily
- we can't set all the options we want on the commandline
- set in option files
- we would need one option file for each daemon running
or by setting environment variables(not permanent enough for me)
Each daemon will need unique settings for the following things:
- --datadir
- This is the most important! This path is where each daemon will keep its data. We can put each directory on a different physical disk to spread out the diskIO, or not.
- --port / --bind-address
- The port this daemon is bound to.
- If we're going move the slave to another machine we need to substitute --bind-address for --port.
- --socket
- Path to the local unix socket.
--shared-memory-base-name- Used only on Windows
- --pid-file
- Path to the file which the daemon records it's process id.
In order to increase performance, it is suggested to spread the disk IO load over separate disks with this option:
- --tmpdir
See http://dev.mysql.com/doc/refman/5.1/en/multiple-servers.html for more information.
Chapter 6 - A note about managing multiple servers
editThere are some scripts packaged with mysql that can help us have concurrent processes:
The mysqld_multi
script is (as far as I can tell) a very basic wrapper around mysqld_safe
. It loops and starts each server you tell it to.
mysqlmanager
- Also you can manipulate/start/stop/etc. each server with
mysqladmin
-- assuming you give the script the correct socket so it knows which server to connect to.
Starting
editAfter fiddling with this for a while, I have found the mysqld_multi
script adequate for starting servers. The command goes something like this:
sudo mysqld_multi start
This will start all servers in the my.cnf
file (which means both master and slave in our case.) Optionally you can put a "group number" as the last parameter. See the docs for an explanation.
Reporting what's going on
editEither try:
sudo mysqld_multi report
or, (which I do a lot):
ps aux | grep mysqld
and sort through the processes.
root
and mysql
user, as well as processes for both the mysqld
binary and the mysqld_safe
script.Stopping
editThe mysqld_multi
binary should stop the servers just like starting them, but I've found that sometimes it doesn't (for unknown reasons.)
If this doesn't work:
sudo mysqld_multi stop
then try this (you'll have to stop each server independently):
sudo mysqladmin --socket=/var/run/mysqld/mysql.sock shutdown sudo mysqladmin --socket=/var/run/mysqld/mysql-slave.sock shutdown
If nothing works, you can kill
the processes, but I'm afraid that might interrupt a transaction in the database.
Connecting
editIf you want to connect to the mysql server, you have two options:
- using the
mysql
command-line interface - using phpMyAdmin through the web-browser
command line
editIf you want to use the command line you can connect to the master like this:
mysql --user=$USERNAME --password=XXXXXX
or if you want to be prompted for a password
mysql --user=$USERNAME -p
You can simplify some of this by creating a ~/.my.cnf
file that holds your settings. You can read more at MySQL's option files page. Here is a short sample:
# personal settings for MySQL programs [client] user = daniel password = XXXXXXXXXXXXX [mysql] auto-rehash line-numbers pager = more
Connecting to the slave requires the --socket
flag:
mysql --user=$USERNAME -p --socket=/var/run/mysq/mysql-slave.sock
In practice normal users should never need to log onto the SLAVE database, the backup-bot should be the only person to ever log-on. If you don't have a good reason for connecting to the slave, don't.
phpMyAdmin
editI think most of the servers have phpMyAdmin installed. Have a look at the Servers section on the Hardware page and then on the server's page -- hopefully there is a URL to phpMyAdmin. Also ask.
Chapter 7 - Getting things running
editSo far we have covered the background of replication, learned about options, and written a my.cnf
configuration file but have not actually set anything up. Now comes the fun part - implementation.
The process can be broken down into sizable pieces:
- set up the directories & permissions
- configure the servers (if not already done)
- start the servers
- add a dedicated replication user to the master (with correct permissions)
- lock the master
- dump the master to an SQL file
- load the slave from SQL file
- connect the slave to the master
- start slave
- unlock master
- debug/test
Backups
editLocations
editScripts
editVersion 1
edit- the first script written only dumped two databases. not sufficient for reliable backups.
#!/bin/bash
# ---------- options ------------------------
DBuser="XXXXX"
DBpassword="YYYYYYY"
DUMPDIR="/path/to/dumps.txt"
TIMESTAMP=`date +%a`
# ----------- backup colipedia ------------------
ewDBname="foo"
ewDUMPFILE="foo.$TIMESTAMP.sql"
# echo "Dumping $DBname to ... $DUMPFILE"
mysqldump $ewDBname -u $DBuser -p$DBpassword > $DUMPDIR/$ewDUMPFILE
# modified to use gzip instead of tar
gzip -c $DUMPDIR/$ewDUMPFILE > $DUMPDIR/$ewDUMPFILE.gz
# remove the huge sql file.
rm $DUMPDIR/$ewDUMPFILE
# ----------- backup GONUTS ------------------
goDBname="bar"
goDUMPFILE="bar.$TIMESTAMP.sql"
#echo "Dumping $DB2name to ... $DUMPFILE2"
mysqldump $goDBname -u $DBuser -p$DBpassword > $DUMPDIR/$goDUMPFILE
gzip -c $DUMPDIR/$goDUMPFILE > $DUMPDIR/$goDUMPFILE.gz
rm $DUMPDIR/$goDUMPFILE
Version 2
edit- this version checks (all) the databases and repairs them if necessary. run-time is about 6 hours, which is a little long.
#!/bin/bash
# --------------------------------------------------------------------
# This is a free shell script under GNU GPL version 2.0 or above
# Copyright (C) 2011
# --------------------------------------------------------------------
# Adapted for use by the Hu Laboratory by:
# Daniel Renfro - Jan 2011
# <bluecurio@gmail.com>
# --------------------------------------------------------------------
# some basic variables
USERNAME="XXXXXXx" # USERNAME
PASSWORD="YYYYYYYYY" # PASSWORD
HOST="localhost" # Hostname
# Linux bin paths, change this if it can not be autodetected via which command
MYSQL="$(which mysql)"
MYSQLDUMP="$(which mysqldump)"
MYSQLCHECK="$(which mysqlcheck)"
GZIP="$(which gzip)"
# Backup Dest directory, change this if you have someother location
DEST="/path/to/dumps.txt"
# Get hostname
#HOST="$(hostname)"
HOST='localhost'
# Get date in dd-mm-yyyy format
NOW="$(date +"%d-%m-%Y")"
# File to store current backup file
FILE=""
# Store list of databases
DBS=""
# DO NOT BACKUP these databases
SKIP="test"
# Get all database list first
if [ "$DBS" == "" ];
then
# you can get a subset by changing the query like so:
# DBS="$($MYSQL -u $USERNAME -h $HOST -p$PASSWORD -Bse 'show databases like "subtilis%"')"
# or you can get ALL the databases
DBS="$($MYSQL -u $USERNAME -h $HOST -p$PASSWORD -Bse 'show databases')"
fi
for db in $DBS
do
skipdb=-1
if [ "$SKIP" != "" ];
then
for i in $SKIP
do
[ "$db" == "$i" ] && skipdb=1 || :
done
fi
if [ "$skipdb" == "-1" ] ; then
echo $db;
# check the database for consistency (time-consuming, but safer.)
$MYSQLCHECK -u $USERNAME -h $HOST -p$PASSWORD --check-only-changed --optimize --auto-repair --medium-check $db
# set a valid filename
FILE="$DEST/$db.$HOSTNAME.$NOW.gz"
# do all in one job in pipe,
# connect to mysql using mysqldump for select mysql database
# and pipe it out to gz file in backup dir :)
$MYSQLDUMP -u $USERNAME -h $HOST -p$PASSWORD $db | $GZIP -9 > $FILE
fi
done
Slow Queries
editThe slow query log usually is defined in /etc/my.cnf
, and typically lives in /var/log/mysql/
. It can be analyzed using the mk-query-digest from the maatkit software suite.
mk-query-digest /var/log/mysql/mysql-slow.log > mk-query-digest.slow.2011-07-12.out
Notes
editSandbox
editThe MySQL Sandbox stuff looks interesting. I installed this via cpan.
You can make a single sandbox:
cd ~daniel/working/mysql/sandbox/ make_sandbox ./mysql-5.1.51-osx10.5-x86.tar.gz
Or a replication sandbox
cd ~daniel/working/mysql/sandbox/ make_replication_sandbox ./mysql-5.1.51-osx10.5-x86.tar.gz
Links
editHelpful links: