Quickly Building a Postgres-X2 Distributed Database in LXC with x2cn_ctl

Posted on 2017-03-19 16:37:02 by osdba

1. Background

1.1 Introduction to the x2cn_ctl tool

x2cn_ctl is a tool, written in Python, for installing and managing Postgres-X2/XL.
It is open source on GitHub: https://github.com/pgx2cn/x2cn_ctl. It provides the following features:

  • Simplifies the Postgres-X2 installation: a complete Postgres-X2 cluster can be installed with just a few commands
  • Builds a Postgres-X2 development/test environment based on LXC and ZFS with just a few commands
  • Starts and stops the whole cluster with a single command
  • Wraps psql so you can connect to any coordinator or datanode with one command
  • Makes it easy to set database parameters on all datanodes or coordinators

This article walks through the whole installation process, using a Postgres-X2 cluster built in an LXC environment as the example.

1.2 Why LXC

LXC lets you run many containers on very modest hardware, which makes it a good fit for building Postgres-X2/XL environments.

1.3 Best practices for an LXC environment

Ubuntu is the best choice for the host machine: its virtualization support is better and more packages are available. With CentOS 7.2, containers start rather slowly, for reasons unknown.
Use ZFS: cloning containers is very fast and saves space, because it uses ZFS's clone feature.

2. Preparing the environment

2.1 Installing and configuring ZFS

Install the ZFS packages:


add-apt-repository ppa:zfs-native/stable
aptitude install zfs-dkms zfsutils

Create a zpool:


zpool create pgx2 /dev/sdb1 -m /lxc_pgx2

Enable ZFS compression and deduplication to save space:


zfs set compression=on pgx2
zfs set dedup=on pgx2

2.2 Installing and configuring LXC

2.2.1 Installing LXC


aptitude install lxc
aptitude install lxc-templates

2.2.2 Configuring LXC

Modify the LXC configuration so that the data files of the containers we create later are stored on ZFS.
Add the following line to /etc/lxc/lxc.conf:


lxc.lxcpath=/lxc_pgx2

2.2.3 Creating an LXC container to use as a template

Create an LXC container; all the containers Postgres-X2 needs will later be cloned from this template:


lxc-create -n pgx2template -t ubuntu -B zfs --zfsroot=pgx2

By default a newly created container gets its IP address via DHCP; we need to give pgx2template a static address.
In the container's network configuration file /lxc_pgx2/pgx2template/rootfs/etc/network/interfaces, take the line that requests a DHCP address:


#iface eth0 inet dhcp

and replace it with:


iface eth0 inet static
        address 10.0.3.5
        netmask 255.255.255.0
        broadcast 255.255.255.0
        dns-nameservers 114.114.114.114
        gateway 10.0.3.1

To avoid slow ssh logins, add the following setting to /etc/ssh/sshd_config (or modify it if it is already present):


UseDNS no

2.2.4 Setting up passwordless ssh

To make later management easier, we set up passwordless ssh from the root user to each LXC container; "passwordless" means we can ssh to these machines directly without entering a password. Since all later containers are cloned from the pgx2template template, it is enough to set up passwordless ssh from the host to pgx2template; the containers cloned from it will inherit it.
For extra convenience, we make every pair of machines (the host and all the LXC containers) able to ssh to each other without a password. The trick: first make the host able to ssh to itself without a password, then copy the key files id_dsa, id_dsa.pub, and authorized_keys from the host's .ssh directory into pgx2template. After that, any of these machines can ssh to any other without a password. The steps are as follows. On the host, generate an ssh key for the root user:


ssh-keygen -t dsa

Then append the contents of /root/.ssh/id_dsa.pub to /root/.ssh/authorized_keys so the host can ssh to itself without a password:


cat /root/.ssh/id_dsa.pub>>/root/.ssh/authorized_keys

Start the pgx2template container and log in to it. Note that for the Ubuntu containers created by LXC, the default user is ubuntu with password ubuntu; once inside you need sudo to switch to root, and you cannot ssh in directly as root the first time:


root@ubuntu01:~# ssh ubuntu@10.0.3.5
ubuntu@10.0.3.5's password:
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)
 * Documentation:  https://help.ubuntu.com/
Last login: Mon Feb 22 12:08:52 2016 from 10.0.3.1
ubuntu@pgx2template:~$ sudo su -
[sudo] password for ubuntu:
root@pgx2template:~#

Then change the root password (here we set it to ubuntu) so that we can later ssh to this container as root:


root@pgx2template:~# passwd
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully

Edit the ssh configuration file /etc/ssh/sshd_config in the pgx2template container to allow root to log in over ssh with a password; the key change is setting PermitRootLogin to yes:


PermitRootLogin yes

After that, run ssh root@127.0.0.1 to check that root can log in to the local machine with a password:


root@pgx2template:/etc/ssh# service ssh restart
ssh stop/waiting
ssh start/running, process 638
root@pgx2template:/etc/ssh# ssh 127.0.0.1
root@127.0.0.1's password:

Finally, copy the whole /root/.ssh directory from the host into the pgx2template container:


root@ubuntu01:~# scp -r /root/.ssh root@10.0.3.5:/root/.
root@10.0.3.5's password:
id_dsa                                                                                                                                       100%  668     0.7KB/s   00:00
id_dsa.pub                                                                                                                                   100%  603     0.6KB/s   00:00
known_hosts.old                                                                                                                              100% 2220     2.2KB/s   00:00
authorized_keys                                                                                                                              100%  611     0.6KB/s   00:00
known_hosts

Then test ssh again to verify that no password is needed:


root@ubuntu01:~/.ssh# ssh 10.0.3.5
Welcome to Ubuntu 14.04.3 LTS (GNU/Linux 3.19.0-25-generic x86_64)
 * Documentation:  https://help.ubuntu.com/
Last login: Mon Feb 22 12:37:40 2016 from 10.0.3.1
root@pgx2template:~#

2.3 Installing Postgres-X2

2.3.1 Installing dependencies in pgx2template

First install two utilities:


apt-get install aptitude
aptitude install wget

To speed up installation, switch the apt sources to the 163 mirror:


cd /etc/apt/
mv sources.list sources.list.backup
wget http://mirrors.163.com/.help/sources.list.trusty
mv sources.list.trusty sources.list
aptitude update

We will install Postgres-X2 from source, so install git first:


aptitude install git

Download the source with git:


cd /usr/src
git clone https://github.com/pgx2cn/postgres-x2.git

We are installing the stable branch REL1_2_STABLE, so switch the branch to REL1_2_STABLE:


cd postgres-x2
git checkout -b REL1_2_STABLE origin/REL1_2_STABLE

Install the build dependencies:


aptitude install zlib1g-dev
aptitude install libreadline6-dev
aptitude install python-dev
aptitude install libperl-dev
aptitude install gcc make
aptitude install bison flex

Configure, build, and install:


./configure --prefix=/usr/local/pgx2_1.2 --with-perl --with-python
make
make install
cd contrib
make
make install
cd /usr/local
ln -sf pgx2_1.2 pgx2

3. Installing and configuring Postgres-X2

3.1 Installing the x2cn_ctl tool

Following the documentation on GitHub (https://github.com/pgx2cn/x2cn_ctl/),
copy the x2cn_ctl.py script to /usr/local/bin on the host and rename it to /usr/local/bin/x2cn_ctl. Then copy the sample configuration file x2cn_ctl.conf to /etc and edit it as follows:


[global]
# Postgres-X2 installation directory
pgx2_install_dir = /usr/local/pgx2

# Default database superuser created when the Postgres-X2 databases are initialized
g_pgx2_user = pg

# IP address prefix
ip_prefix = 10.0.3


# .profile on Ubuntu; .bash_profile on RHEL and CentOS

profile_name = .profile


# Whether to use LXC; if so, deploy this program on the machine hosting the LXC containers
use_lxc = 1

# Directory where the LXC containers are stored
lxc_path = /lxc_pgx2


# Template container used to create the other LXC containers
lxc_template = pgx2template

[coordinator]
# Parameters for the coordinators' postgresql.conf
listen_addresses = '*'
max_connections = 100
tcp_keepalives_idle = 10
tcp_keepalives_interval = 20
tcp_keepalives_count = 3
shared_buffers = 16MB
max_wal_senders = 5
wal_keep_segments = 16
wal_level = hot_standby
hot_standby = off
logging_collector = on
track_activities = on
track_counts = on
track_io_timing = off
track_functions = pl
track_activity_query_size = 4096
autovacuum = on
# Note: Postgres-XL does not support min_pool_size
#min_pool_size = 10
max_pool_size = 100

[datanode]
# Parameters for the datanodes' postgresql.conf
listen_addresses = '*'
max_connections = 200
tcp_keepalives_idle = 10
tcp_keepalives_interval = 20
tcp_keepalives_count = 3
shared_buffers = 128MB
vacuum_cost_delay = 2
vacuum_cost_limit = 1000
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 1000
wal_level = hot_standby
wal_writer_delay = 10ms
checkpoint_segments = 256
checkpoint_timeout = 15min
max_wal_senders = 5
wal_keep_segments = 256
# Postgres-XL 9.2 does not support the wal_sender_timeout parameter
#wal_sender_timeout = 60s
hot_standby = on
log_min_duration_statement = 120000
logging_collector = on
track_activities = on
track_counts = on
track_io_timing = off
track_functions = pl
track_activity_query_size = 4096
autovacuum = on
autovacuum_vacuum_cost_delay = 2ms


[host01]
hostname=gtm
ip=10.0.3.10

have_gtm = 1
gtm_nodename = gtm
gtm_port = 6666
gtm_os_user = gtm
gtm_os_uid = 701
gtm_pgdata = /home/gtm/pgdata

[host02]
hostname=gtmstb
ip=10.0.3.20

have_gtm_standby = 1
gtm_standby_nodename = gtmstb
gtm_standby_port = 6666
gtm_standby_os_user = gtm_standby
gtm_standby_os_uid = 702
gtm_standby_pgdata = /home/gtm_standby/pgdata

[host03]
hostname=dn01
ip=10.0.3.11

have_gtm_proxy = 1
gtm_proxy_nodename = gtmproxy01
gtm_proxy_port = 6666
gtm_proxy_os_user = gtm_proxy
gtm_proxy_os_uid = 703
gtm_proxy_pgdata = /home/gtm_proxy/pgdata

have_coordinator = 1
coordinator_nodename = coord01
coordinator_os_user = coord
coordinator_os_uid = 704
coordinator_port = 6601
coordinator_pooler_port = 6601
coordinator_pgdata = /home/coord/pgdata

have_datanode = 1
datanode_nodename = dn01
datanode_os_user = datanode
datanode_os_uid = 705
datanode_port = 6701
datanode_pooler_port = 6801
datanode_pgdata = /home/datanode/pgdata


[host04]
hostname=dn02
ip=10.0.3.12

have_gtm_proxy = 1
gtm_proxy_nodename = gtmproxy02
gtm_proxy_port = 6666
gtm_proxy_os_user = gtm_proxy
gtm_proxy_os_uid = 703
gtm_proxy_pgdata = /home/gtm_proxy/pgdata

have_coordinator = 1
coordinator_nodename = coord02
coordinator_os_user = coord
coordinator_os_uid = 704
coordinator_port = 6601
coordinator_pooler_port = 6601
coordinator_pgdata = /home/coord/pgdata

have_datanode = 1
datanode_nodename = dn02
datanode_os_user = datanode
datanode_os_uid = 705
datanode_port = 6701
datanode_pooler_port = 6801
datanode_pgdata = /home/datanode/pgdata

This Postgres-X2 cluster uses four containers:

  • the 1st hosts the gtm
  • the 2nd hosts the gtm standby
  • the 3rd hosts one coordinator node and one datanode node
  • the 4th hosts one coordinator node and one datanode node
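x2cn_ctl is written in Python, and the configuration above is standard INI syntax. As an illustration (a minimal sketch using an assumed, abbreviated config, not the tool's actual code), the [hostNN] sections can be enumerated with the standard library's ConfigParser to see which roles each host carries:

```python
# Sketch: enumerate the roles declared in each [hostNN] section of an
# x2cn_ctl.conf-style file. SAMPLE is a shortened, hypothetical config.
from configparser import ConfigParser
from io import StringIO

SAMPLE = """
[global]
ip_prefix = 10.0.3

[host03]
hostname = dn01
ip = 10.0.3.11
have_gtm_proxy = 1
have_coordinator = 1
have_datanode = 1
"""

ROLES = ("gtm", "gtm_standby", "gtm_proxy", "coordinator", "datanode")

def host_roles(cfg):
    """Return {hostname: [roles]} for every [hostNN] section."""
    result = {}
    for section in cfg.sections():
        if not section.startswith("host"):
            continue
        # A role is present when its have_<role> flag is set to 1.
        roles = [r for r in ROLES
                 if cfg.get(section, "have_" + r, fallback="0") == "1"]
        result[cfg.get(section, "hostname")] = roles
    return result

cfg = ConfigParser()
cfg.read_file(StringIO(SAMPLE))
print(host_roles(cfg))  # {'dn01': ['gtm_proxy', 'coordinator', 'datanode']}
```

This is also why a host can combine a gtm_proxy, a coordinator, and a datanode, as host03 and host04 do above: each role is an independent flag in its section.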

After configuring x2cn_ctl.conf, check that the configuration is correct with the following command:


root@ubuntu01:/etc# x2cn_ctl check_config
=== x2cn_ctl V0.1 Author: osdba ===

Check OK.

3.2 Creating the containers Postgres-X2 needs

Use the x2cn_ctl lxc_add_all command to create the containers we need:


root@ubuntu01:~# x2cn_ctl lxc_add_all
=== x2cn_ctl V0.1 Author: osdba ===

2016-02-22 15:27:51 INFO Begin create contain(gtm)...
could not find any snapshots to destroy; check snapshot names.
Created container gtm as snapshot of pgx2template
2016-02-22 15:27:51 INFO Create lxc contain(gtm) ok.
2016-02-22 15:27:51 INFO Begin modify lxc contain(gtm) ip to 10.0.3.10 ...
2016-02-22 15:27:51 INFO Modify lxc contain(gtm) ip to 10.0.3.10 ok.
...
...

You can see that the required containers have been created:


root@ubuntu01:~# lxc-ls -f
NAME              STATE    IPV4       IPV6  AUTOSTART
-----------------------------------------------------
dn01              RUNNING  10.0.3.11  -     NO
dn02              RUNNING  10.0.3.12  -     NO
gtm               RUNNING  10.0.3.10  -     NO
gtmstb            RUNNING  10.0.3.20  -     NO
pgx2template      STOPPED  -          -     NO

Note that the pgx2template container must be stopped before creating the containers; otherwise you will get the following error:


root@ubuntu01:~# x2cn_ctl lxc_add_all
=== x2cn_ctl V0.1 Author: osdba ===

2016-02-22 15:26:17 INFO Begin create contain(gtm)...
lxc_container: lxccontainer.c: lxcapi_clone: 2635 error: Original container (pgx2template) is running
clone failed

3.3 Creating the OS users for Postgres-X2 on each container

The x2cn_ctl add_os_user_all command does this in one step:


root@ubuntu01:~# x2cn_ctl add_os_user_all
=== x2cn_ctl V0.1 Author: osdba ===

2016-02-22 15:33:45 INFO Begin add user and group to host(10.0.3.10) ...
2016-02-22 15:33:45 INFO Add user and group to host(10.0.3.10) lxc container ok.
2016-02-22 15:33:45 INFO Begin modify /home/gtm/.profile in host(10.0.3.10) ...
Connection to 10.0.3.10 closed.
...
2016-02-22 15:33:46 INFO Modify modify /home/gtm/.profile in host(10.0.3.10) finished.
2016-02-22 15:33:46 INFO Begin add user and group to host(10.0.3.20) ...
2016-02-22 15:33:46 INFO Add user and group to host(10.0.3.20) lxc container ok.
2016-02-22 15:33:46 INFO Begin modify /home/gtm_standby/.profile in host(10.0.3.20) ...
Connection to 10.0.3.20 closed.
...
...

3.4 Initializing the Postgres-X2 databases


root@ubuntu01:~# x2cn_ctl initdb_all
=== x2cn_ctl V0.1 Author: osdba ===

The files belonging to this GTM system will be owned by user "gtm".
This user must also own the server process.

fixing permissions on existing directory /home/gtm/pgdata ... ok
creating configuration files ... ok

Success. You can now start the GTM server using:
    gtm -D /home/gtm/pgdata
or
    gtm_ctl -Z gtm -D /home/gtm/pgdata -l logfile start

Connection to 10.0.3.10 closed.
...
The files belonging to this GTM system will be owned by user "gtm_standby".
This user must also own the server process.

fixing permissions on existing directory /home/gtm/pgdata ... ok
creating configuration files ... ok
Success. You can now start the GTM server using:

    gtm -D /home/gtm/pgdata
or
    gtm_ctl -Z gtm -D /home/gtm/pgdata -l logfile start

Connection to 10.0.3.20 closed.
...
The files belonging to this GTM system will be owned by user "gtm_proxy".
This user must also own the server process.

fixing permissions on existing directory /home/gtm_proxy/pgdata ... ok
creating configuration files ... ok

Success. You can now start the GTM proxy server using:
    gtm_proxy -D /home/gtm_proxy/pgdata
or
    gtm_ctl -Z gtm_proxy -D /home/gtm_proxy/pgdata -l logfile start

Connection to 10.0.3.11 closed.
...
The files belonging to this GTM system will be owned by user "gtm_proxy".
This user must also own the server process.

fixing permissions on existing directory /home/gtm_proxy/pgdata ... ok
creating configuration files ... ok

Success. You can now start the GTM proxy server using:
    gtm_proxy -D /home/gtm_proxy/pgdata
or
    gtm_ctl -Z gtm_proxy -D /home/gtm_proxy/pgdata -l logfile start

Connection to 10.0.3.12 closed.
...
...
...

The databases can then be started with the following command:


x2cn_ctl start

3.5 Registering the Postgres-X2 node information

Postgres-X2 requires that, after the databases are initialized, SQL like the following be executed on each coordinator to register all the nodes with that coordinator:


alter node coord01 with (host = '10.0.3.11', port= 6601);
create node coord02 with (type = 'coordinator', host = '10.0.3.12', port= 6601);
create node dn01 with (type = 'datanode', host = '10.0.3.11', port= 6701);
create node dn02 with (type = 'datanode', host = '10.0.3.12', port= 6701);
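The rule behind these statements can be sketched in Python (a hypothetical helper for illustration, not x2cn_ctl's actual code): on each coordinator, the local node is re-registered with ALTER NODE, and every other node is added with CREATE NODE.

```python
# Sketch: generate the node-registration SQL for a given coordinator.
# NODES mirrors the cluster configured above: (name, type, host, port).
NODES = [
    ("coord01", "coordinator", "10.0.3.11", 6601),
    ("coord02", "coordinator", "10.0.3.12", 6601),
    ("dn01",    "datanode",    "10.0.3.11", 6701),
    ("dn02",    "datanode",    "10.0.3.12", 6701),
]

def reg_sql(local_coord):
    """SQL to run on coordinator `local_coord` to register all nodes."""
    stmts = []
    for name, ntype, host, port in NODES:
        if name == local_coord:
            # The coordinator itself already exists; only update it.
            stmts.append("alter node %s with (host = '%s', port = %d);"
                         % (name, host, port))
        else:
            stmts.append("create node %s with (type = '%s', host = '%s', port = %d);"
                         % (name, ntype, host, port))
    return stmts

for stmt in reg_sql("coord01"):
    print(stmt)
```

Running it for coord01 reproduces the four statements shown above; running it for coord02 yields the variant that must be executed on the second coordinator.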

The x2cn_ctl coord_reg_node command does this work for you:


x2cn_ctl coord_reg_node

3.6 Testing Postgres-X2

On container 10.0.3.11, su to the coord user:


root@dn01:~# su - coord
coord@dn01:~$

Connect to the coordinator with psql and create a test table to try it out:


coord@dn01:~$ psql -Upg  -p 6601 -Upg postgres
psql (PGXC 1.2devel, based on PG 9.3.10)
Type "help" for help.

postgres=# create table test01(id int primary key, note text);
CREATE TABLE
postgres=# \d+ test01
                        Table "public.test01"
 Column |  Type   | Modifiers | Storage  | Stats target | Description
--------+---------+-----------+----------+--------------+-------------
 id     | integer | not null  | plain    |              |
 note   | text    |           | extended |              |
Indexes:
    "test01_pkey" PRIMARY KEY, btree (id)
Has OIDs: no
Distribute By: HASH(id)
Location Nodes: ALL DATANODES

4. x2cn_ctl usage examples

4.1 The list command

The list command shows the cluster's configuration:


root@ubuntu01:~# x2cn_ctl list
=== x2cn_ctl V0.1 Author: osdba ===

 hostname       ip       nodetype     nodename     port   os_user                pgdata
----------  ---------  -----------  ------------  -----  ----------  ------------------------------
gtm         10.0.3.10  gtm          gtm            6666  gtm         /home/gtm/pgdata
gtmstb      10.0.3.20  gtm_standby  gtmstb         6666  gtm         /home/gtm/pgdata
dn01        10.0.3.11  gtm_proxy    gtmproxy01     6666  gtm_proxy   /home/gtm_proxy/pgdata
                       coordinator  coord01        6601  coord       /home/coord/pgdata
                       datanode     dn01           6701  datanode    /home/datanode/pgdata
dn02        10.0.3.12  gtm_proxy    gtmproxy02     6666  gtm_proxy   /home/gtm_proxy/pgdata
                       coordinator  coord02        6601  coord       /home/coord/pgdata
                       datanode     dn02           6701  datanode    /home/datanode/pgdata

4.2 The status command

The status command shows the running state of the database on each node of the cluster:


root@ubuntu01:~# x2cn_ctl status
=== x2cn_ctl V0.1 Author: osdba ===

 hostname       ip       nodetype     nodename     port         status
----------  ---------  -----------  ------------  -----  --------------------
gtm         10.0.3.10  gtm          gtm            6666  Running(pid=753)
gtmstb      10.0.3.20  gtm_standby  gtmstb         6666  Running(pid=732)
dn01        10.0.3.11  gtm_proxy    gtmproxy01     6666  Running(pid=1603)
                       coordinator  coord01        6601  Running(pid=1644)
                       datanode     dn01           6701  Running(pid=1620)
dn02        10.0.3.12  gtm_proxy    gtmproxy02     6666  Running(pid=1596)
                       coordinator  coord02        6601  Running(pid=1637)
                       datanode     dn02           6701  Running(pid=1613)

4.3 The psql command

This command is a thin wrapper around psql:

  • -c 1 connects to the 1st coordinator
  • -c 2 connects to the 2nd coordinator
  • and so on: -c n, where n is 1, 2, 3, ..., connects to the n-th coordinator
  • -d 1 connects to the 1st datanode
  • -d 2 connects to the 2nd datanode
  • and so on: -d n, where n is 1, 2, 3, ..., connects to the n-th datanode

Usage examples follow.
Connect to the 1st coordinator:


root@ubuntu01:~# x2cn_ctl psql -c 1
=== x2cn_ctl V0.1 Author: osdba ===

psql (PGXC 1.2devel, based on PG 9.3.10)
Type "help" for help.

postgres=#

Connect to the 2nd coordinator:


root@ubuntu01:~# x2cn_ctl psql -c 2
=== x2cn_ctl V0.1 Author: osdba ===

psql (PGXC 1.2devel, based on PG 9.3.10)
Type "help" for help.

postgres=#

Connect to the 2nd datanode:


root@ubuntu01:~# x2cn_ctl psql -d 2
=== x2cn_ctl V0.1 Author: osdba ===

psql (PGXC 1.2devel, based on PG 9.3.10)
Type "help" for help.

postgres=#
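The index-to-node mapping used above can be sketched as follows (a hypothetical helper assuming the host/port layout from the config earlier; this is not x2cn_ctl's actual implementation):

```python
# Sketch: resolve "-c n" / "-d n" to the n-th node's connection parameters
# and build the psql command line. Node lists mirror the config above.
COORDINATORS = [("10.0.3.11", 6601), ("10.0.3.12", 6601)]
DATANODES    = [("10.0.3.11", 6701), ("10.0.3.12", 6701)]

def psql_argv(kind, n, user="pg", db="postgres"):
    """Build the psql argv for the n-th (1-based) coordinator ('c') or datanode ('d')."""
    host, port = (COORDINATORS if kind == "c" else DATANODES)[n - 1]
    return ["psql", "-h", host, "-p", str(port), "-U", user, db]

print(" ".join(psql_argv("c", 2)))  # psql -h 10.0.3.12 -p 6601 -U pg postgres
```

So `x2cn_ctl psql -c 2` amounts to running psql against the second coordinator's host and port with the superuser configured in g_pgx2_user.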

4.4 The set command

The set command modifies a database parameter on all coordinators or all datanodes.
For example, the following command changes max_connections to 500 in postgresql.conf on every coordinator in the cluster:


root@ubuntu01:~# x2cn_ctl set -c -k max_connections -v 500
=== x2cn_ctl V0.1 Author: osdba ===

2016-02-27 18:52:03 INFO Modify node(coord01) postgresql.conf ...
postgresql.conf                                                                                                                              100%   22KB  21.6KB/s   00:00
Connection to 10.0.3.11 closed.
tmpudRuFe                                                                                                                                    100%   22KB  21.6KB/s   00:00
2016-02-27 18:52:03 INFO Modify node(coord02) postgresql.conf ...
postgresql.conf                                                                                                                              100%   22KB  21.6KB/s   00:00
Connection to 10.0.3.12 closed.
tmpObNqB7
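Logically, what `set -k KEY -v VALUE` does to each node's postgresql.conf can be sketched like this (an illustrative helper under assumed behavior, not x2cn_ctl's actual implementation):

```python
# Sketch: update one setting in a postgresql.conf-style text. If the key is
# present (even commented out), its first occurrence is replaced; otherwise
# the setting is appended at the end.
import re

def set_param(conf_text, key, value):
    pattern = re.compile(r"^\s*#?\s*%s\s*=.*$" % re.escape(key), re.M)
    new_line = "%s = %s" % (key, value)
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text, count=1)
    return conf_text + "\n" + new_line + "\n"

conf = "shared_buffers = 16MB\nmax_connections = 100\n"
print(set_param(conf, "max_connections", "500"))
```

The tool then copies the rewritten file back to each node over ssh/scp, which is what the scp progress lines in the output above show.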

The help for x2cn_ctl's set command looks like this:


root@ubuntu01:~# x2cn_ctl set
=== x2cn_ctl V0.1 Author: osdba ===

Usage: x2cn_ctl set [options]


Options:
  -h, --help            show this help message and exit
  -l LOGLEVEL, --loglevel=LOGLEVEL
                        Specifies log level:  debug, info, warn, error,
                        critical, default is info
  -d, --datanode        Set configuration of datanode.
  -c, --coordinator     Set configuration of coordinator.
  -k KEY, --key=KEY     Specify key of configuration
  -v VALUE, --value=VALUE
                        Specify key of configuration

As shown above, -d means modify the datanodes' database parameters, while -c means modify the coordinators' database parameters.

4.5 Other commands

You can try out the remaining commands yourself; they are not covered in detail here. Have fun!