Ceph is a distributed open source storage solution that supports Object Storage, Block Storage and File Storage.

Other open source distributed storage systems are GlusterFS and HDFS.

In this guide, we describe how to set up a basic Ceph cluster for Block Storage. We have 25 nodes in our setup. The masternode is a MAAS region and rack controller. The rest of the nodes run Ubuntu 16.04, deployed through MAAS. The recommended filesystem for Ceph is XFS, and this is what is used on the nodes.

This guide is based on the Quick Installation guide from the Ceph documentation. It uses the ceph-deploy tool, which is a relatively quick way to set up Ceph, especially for newcomers. Other options are the Manual Installation and deployment through Ansible or Juju.

Prerequisites

Topology

  • 1 deploy node (masternode). The MAAS region and rack controller is installed, plus Ansible
  • 3 monitor nodes (node01,node11,node24). Ubuntu 16.04 on XFS deployed through MAAS
  • 20 OSD nodes (node02-10, node12-14, node16-23).

Create an Ubuntu user on masternode

It is convenient to create an ubuntu user on the masternode, with passwordless sudo access:

$ sudo useradd -m -s /bin/bash ubuntu

Run visudo and give passwordless sudo access to the ubuntu user:

ubuntu  ALL=NOPASSWD:ALL

Generate an SSH key pair for the ubuntu user:

$ ssh-keygen -b 4096
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ubuntu/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/ubuntu/.ssh/id_rsa.
Your public key has been saved in /home/ubuntu/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:t1zWURVk7j6wJPkA3VmbcHtAKh3EB0kyanORVbiiBkU ubuntu@masternode
The key's randomart image is:
+---[RSA 4096]----+
|       .E +**B=*=|
|        ..o==oOo+|
|       .+.o.o=.=.|
|      .. oo.o....|
|       .S..=oo.. |
|        oo += +  |
|       .  o  o o |
|                .|
|                 |
+----[SHA256]-----+

Deploy the /home/ubuntu/.ssh/id_rsa.pub public key on all the nodes (append it to /home/ubuntu/.ssh/authorized_keys). Alternatively, you can add this public key to the MAAS user before deploying Ubuntu 16.04 on the nodes.
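
If you append the key by hand or from a script, it helps to make the step idempotent, so that running it twice does not duplicate the line. A minimal sketch, meant to be run on a node; the function name add_authorized_key and its arguments are our own invention and not part of OpenSSH:

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present.
# Usage: add_authorized_key <pubkey-file> <authorized_keys-file>
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    # Create the file if needed and keep it private to its owner.
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

For example, add_authorized_key id_rsa.pub /home/ubuntu/.ssh/authorized_keys, assuming the public key has already been copied to the node.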

Set /etc/hosts

$ for ID in {01..24}; do echo "$(dig +short node${ID}.maas @127.0.0.1) node${ID}.maas node${ID}"; done > nodes.txt

It should look like this:

192.168.10.28 node01.maas node01
192.168.10.29 node02.maas node02
192.168.10.30 node03.maas node03
192.168.10.31 node04.maas node04
192.168.10.32 node05.maas node05
192.168.10.33 node06.maas node06
192.168.10.34 node07.maas node07
192.168.10.35 node08.maas node08
192.168.10.36 node09.maas node09
192.168.10.37 node10.maas node10
192.168.10.38 node11.maas node11
192.168.10.39 node12.maas node12
192.168.10.40 node13.maas node13
192.168.10.41 node14.maas node14
192.168.10.42 node16.maas node16
192.168.10.43 node17.maas node17
192.168.10.44 node18.maas node18
192.168.10.45 node19.maas node19
192.168.10.46 node20.maas node20
192.168.10.47 node21.maas node21
192.168.10.48 node22.maas node22
192.168.10.49 node23.maas node23
192.168.10.50 node24.maas node24
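
Before appending, it may be worth checking that every node actually resolved, since a name that dig cannot resolve yields a line without an IP address. The sketch below assumes the three-column format shown above; missing_nodes is a hypothetical helper name:

```shell
# Hypothetical helper: list the node names in FIRST..LAST that have no
# complete "IP fqdn shortname" line in a hosts-style file.
# Usage: missing_nodes <file> <first> <last>   (plain integers, e.g. 1 24)
missing_nodes() {
    file=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        id=$(printf '%02d' "$i")
        # A complete line has the short name in the third column. If the IP
        # is missing, the name shifts to the second column and is not matched.
        if ! awk -v n="node$id" '$3 == n { found = 1 } END { exit !found }' "$file"; then
            echo "node$id"
        fi
        i=$((i + 1))
    done
}
```

Running missing_nodes nodes.txt 1 24 prints one node name per gap. With the listing shown above it would report node15, which has no entry.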

Now you can append the result to /etc/hosts:

$ cat nodes.txt | sudo tee -a /etc/hosts

Ansible setup

Use this setup in /etc/ansible/hosts on masternode:

[masternode]
masternode

[nodes]
node01
node02
node03
node04
node05
node06
node07
node08
node09
node10
node11
node12
node13
node14
node15
node16
node17
node18
node19
node20
node21
node22
node23
node24

[ceph-mon]
node01
node11
node24

[ceph-osd]
node02
node03
node04
node05
node06
node07
node08
node09
node10
node12
node13
node14
node15
node16
node17
node18
node19
node20
node21
node22
node23

Install python on all the nodes

$ for ID in {01..24}
> do
>  ssh node${ID} "sudo apt -y install python-minimal"
> done

Ensure time synchronization of the nodes

Install the theodotos/debian-ntp role from Ansible Galaxy:

$ sudo ansible-galaxy install theodotos.debian-ntp

Create a basic playbook ntp-init.yml:

---
- hosts: nodes
  remote_user: ubuntu
  become: yes
  roles:
     - { role: theodotos.debian-ntp, ntp.server: masternode }

Apply the playbook:

$ ansible-playbook ntp-init.yml

Verify that the monitor nodes are time synchronized:

$ ansible ceph-mon -a 'timedatectl'
node11 | SUCCESS | rc=0 >>
      Local time: Fri 2017-04-28 08:06:30 UTC
  Universal time: Fri 2017-04-28 08:06:30 UTC
        RTC time: Fri 2017-04-28 08:06:30
       Time zone: Etc/UTC (UTC, +0000)
 Network time on: yes
NTP synchronized: yes
 RTC in local TZ: no

node24 | SUCCESS | rc=0 >>
      Local time: Fri 2017-04-28 08:06:30 UTC
  Universal time: Fri 2017-04-28 08:06:30 UTC
        RTC time: Fri 2017-04-28 08:06:30
       Time zone: Etc/UTC (UTC, +0000)
 Network time on: yes
NTP synchronized: yes
 RTC in local TZ: no

node01 | SUCCESS | rc=0 >>
      Local time: Fri 2017-04-28 08:06:30 UTC
  Universal time: Fri 2017-04-28 08:06:30 UTC
        RTC time: Fri 2017-04-28 08:06:30
       Time zone: Etc/UTC (UTC, +0000)
 Network time on: yes
NTP synchronized: yes
 RTC in local TZ: no

Check also the OSD nodes:

$ ansible ceph-osd -a 'timedatectl'
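
With this many nodes the output is long to read by eye. A small filter can list only the hosts that are not synchronized. This is a sketch that assumes the ad-hoc output format and the "NTP synchronized" label shown above (newer timedatectl versions word this line differently); unsynced_hosts is a hypothetical name:

```shell
# Hypothetical helper: read the output of "ansible <group> -a timedatectl"
# on stdin and print the hosts that never report "NTP synchronized: yes".
unsynced_hosts() {
    awk '
        # Host header, e.g. "node11 | SUCCESS | rc=0 >>"
        /^[^ ]+ [|] / { host = $1; synced[host] = 0 }
        /NTP synchronized: yes/ && host != "" { synced[host] = 1 }
        END { for (h in synced) if (!synced[h]) print h }
    ' | sort
}
```

Usage would be along the lines of: ansible nodes -a 'timedatectl' | unsynced_hosts. An empty result means every host that answered reported a synchronized clock.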

Install Ceph

Install ceph-deploy

On masternode:

$ sudo apt install ceph-deploy

Create a new cluster and set the monitor nodes (use an odd number of monitors, so that a majority can always form a quorum):

$ ceph-deploy new node01 node11 node24

Install Ceph on the masternode and all the other nodes:

$ ceph-deploy install masternode node{01..24}

Deploy the monitors and gather the keys:

$ ceph-deploy mon create-initial

Prepare the OSD nodes

Create the OSD directories

Create the OSD directories on the OSD nodes:

$ I=0;
$ for ID in {02..10} {12..14} {16..23}
> do 
>  ssh -l ubuntu node${ID} "sudo mkdir /var/local/osd${I}"
>  I=$((${I}+1))
> done;

Verify that the OSD directories are created:

$ ansible ceph-osd -a "ls /var/local" | cut -d\| -f1 | xargs -n2 | sort
node02 osd0
node03 osd1
node04 osd2
node05 osd3
node06 osd4
node07 osd5
node08 osd6
node09 osd7
node10 osd8
node12 osd9
node13 osd10
node14 osd11
node16 osd12
node17 osd13
node18 osd14
node19 osd15
node20 osd16
node21 osd17
node22 osd18
node23 osd19

Nodes 01, 11 and 24 are excluded because those are the monitor nodes.
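
The same node list and counter are reused in the next steps, so it can help to generate the node-to-index pairs in one place. A small sketch; osd_map is a hypothetical helper, and the loop below only prints the commands instead of running them:

```shell
# Hypothetical helper: print "<node> <osd-index>" pairs, numbering the
# given nodes from 0 in the order they are listed.
osd_map() {
    i=0
    for node in "$@"; do
        echo "$node $i"
        i=$((i + 1))
    done
}

# Dry run with three sample nodes: show the mkdir command for each pair.
osd_map node02 node03 node04 | while read -r node idx; do
    echo "ssh -l ubuntu $node sudo mkdir /var/local/osd$idx"
done
```

In bash the full list could be passed as osd_map node{02..10} node{12..14} node{16..23}.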

Fix OSD permissions

The Ceph daemons run as the ceph user, so we need to change the ownership of the OSD directories to ceph:ceph. Otherwise you will get this error:

** ERROR: error creating empty object store in /var/local/osd0: (13) Permission denied

Change the ownership of the OSD directories on the OSD nodes:

$ I=0;
$ for ID in {02..10} {12..14} {16..23}
> do 
>   ssh -l ubuntu node${ID} "sudo chown ceph:ceph /var/local/osd${I}"
>   I=$((${I}+1))
> done;

Prepare the OSDs

$ I=0
$ for ID in {02..10} {12..14} {16..23}
> do
>   ceph-deploy --username ubuntu osd prepare node${ID}:/var/local/osd${I}
>   I=$((${I}+1))
> done

Activate the OSDs

For all the OSD nodes:

$ I=0
$ for ID in {02..10} {12..14} {16..23}
> do
>   ceph-deploy --username ubuntu osd activate node${ID}:/var/local/osd${I}
>   I=$((${I}+1))
> done

Deploy the configuration file and admin key

Now we need to deploy the configuration file and admin key to the admin node and our Ceph nodes. This will save us from having to specify the monitor address and keyring every time we execute a Ceph CLI command.

$ ceph-deploy admin masternode node{01..24}

Set the keyring to be world readable:

$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring

Test and verify

$ ceph health
HEALTH_WARN too few PGs per OSD (9 < min 30)
HEALTH_ERR clock skew detected on mon.node11, mon.node24; 64 pgs are stuck inactive for more than 300 seconds; 64 pgs stuck inactive; 64 pgs stuck unclean; Monitor clock skew detected 

Our newly built cluster is not healthy. We need to increase the number of Placement Groups (PGs). The formula is the minimum expected number of PGs per OSD (30) times the number of OSDs (20), rounded to the closest power of 2:

30x20=600 => pg_num=512
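
The rounding step is easy to get wrong by hand, so here is the rule above written as a small function. It is only a sketch of the rule as described in this guide, not official Ceph sizing guidance; pg_num_for is a hypothetical name. For 30 PGs per OSD and 20 OSDs the product is 600, which lies between 512 and 1024 and is closer to 512:

```shell
# Hypothetical helper: multiply the per-OSD minimum by the OSD count and
# round the product to the nearest power of two (ties round up).
# Usage: pg_num_for <min-pgs-per-osd> <number-of-osds>
pg_num_for() {
    target=$(( $1 * $2 ))
    low=1
    # Find the largest power of two that does not exceed the target.
    while [ $((low * 2)) -le "$target" ]; do
        low=$((low * 2))
    done
    high=$((low * 2))
    if [ $((target - low)) -lt $((high - target)) ]; then
        echo "$low"
    else
        echo "$high"
    fi
}

pg_num_for 30 20   # prints 512
```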

Increase PGs:

$ ceph osd pool set rbd pg_num 512

Now we run ceph health again:

$ ceph health
HEALTH_WARN pool rbd pg_num 512 > pgp_num 64

Still some tweaking needs to be done. We need to adjust pgp_num to 512:

$ ceph osd pool set rbd pgp_num 512

And we are there at last:

$ ceph health
HEALTH_OK
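
After changing pg_num and pgp_num the cluster may need some time to settle, so HEALTH_OK might not appear immediately. A rough polling sketch, assuming the ceph CLI is on the PATH and the admin keyring is readable; wait_for_health_ok and its default values are hypothetical:

```shell
# Hypothetical helper: poll "ceph health" until it reports HEALTH_OK or
# the attempts run out.
# Usage: wait_for_health_ok [attempts] [delay-seconds]
wait_for_health_ok() {
    attempts=${1:-30}
    delay=${2:-10}
    status=""
    while [ "$attempts" -gt 0 ]; do
        status=$(ceph health 2>/dev/null)
        case $status in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    echo "still not healthy: $status" >&2
    return 1
}
```

Calling wait_for_health_ok 60 5 would check every 5 seconds for up to 5 minutes and return a non-zero exit status if the cluster never becomes healthy.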

Create a Ceph Block Device

Check the available storage:

$ ceph df
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED 
    11151G     10858G         293G          2.63 
POOLS:
    NAME     ID     USED     %USED     MAX AVAIL     OBJECTS 
    rbd      0       306         0         3619G           4 

Now we need to create a RADOS Block Device (RBD) to hold our data.

$ rbd create clusterdata --size 4T --image-feature layering

Check the new block device:

$ rbd ls -l
NAME         SIZE PARENT FMT PROT LOCK 
clusterdata 4096G          2

Map the block device:

$ sudo rbd map clusterdata --name client.admin
/dev/rbd0

Format the clusterdata device:

$ sudo mkfs -t ext4 /dev/rbd0

Mount the block device:

$ sudo mkdir /srv/clusterdata
$ sudo mount /dev/rbd0 /srv/clusterdata

Now we have a block device for data that is distributed among the 20 storage nodes.

Here is a summary of some useful monitoring and troubleshooting commands for Ceph:

$ ceph health
$ ceph health detail
$ ceph status    # or: ceph -s
$ ceph osd stat
$ ceph osd tree
$ ceph mon dump
$ ceph mon stat
$ ceph -w
$ ceph quorum_status --format json-pretty
$ ceph mon_status --format json-pretty
$ ceph df

If you run into trouble, contact the awesome folks on the #ceph IRC channel, hosted on the Open and Free Technology Community (OFTC) IRC network.

Start over

If you messed up the procedure and need to start over, you can use the following commands:

$ ceph-deploy purge masternode node{01..24}
$ ceph-deploy purgedata masternode node{01..24}
$ ceph-deploy forgetkeys
$ for ID in {02..10} {12..14} {16..23}; do ssh node${ID} "sudo rm -fr /var/local/osd*"; done
$ rm ceph.conf ceph-deploy-ceph.log .cephdeploy.conf

NOTE: this procedure will destroy your Ceph cluster along with all the data!

Conclusions

Using ceph-deploy may be an easy way to get started with Ceph, but it does not provide much customization. For a more fine-tuned setup you may be better off with the Manual Installation, even though it has a steeper learning curve.


Wiki.js is an elegant looking wiki based on Markdown. It supports LDAP and many more authentication mechanisms. In this guide we describe how to install Wiki.js on Ubuntu 16.04.

Prerequisites

  • An Ubuntu 16.04 instance.

Install curl, Node.js v8.x and build-essential:

# apt -y install curl
# curl -sL https://deb.nodesource.com/setup_8.x | bash -
# apt -y install nodejs build-essential

Install MongoDB v3.4

# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 0C49F3730359A14518585931BC711F9BA15703C6
# echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-3.4.list
# apt update
# apt -y install mongodb-org

Start MongoDB:

# systemctl start mongod

Enable MongoDB at startup:

# systemctl enable mongod

Install git

The version that comes with Ubuntu 16.04 meets the minimum requirements, so there is no need to install it from upstream.

# apt -y install git

Install Wiki.js

# mkdir /srv/wiki.js
# cd /srv/wiki.js
# npm install wiki.js@latest

You will get this message:

> Browse to http://your-server:3000/ to configure your wiki! (Replaced your-server with the hostname or IP of your server!)
▐   ⠂    ▌ I'll wait until you're done ;)

Do as the message says. Let the wizard wait until we are done, and open another shell to work with.

Set up Nginx

Install Nginx:

# apt -y install nginx

Create this VirtualHost configuration (/etc/nginx/sites-available/wiki.example.com.conf):

server {
    listen      [::]:80 ipv6only=off;
    server_name wiki.example.com;
    return      301 https://$server_name$request_uri;
}
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name  wiki.example.com;

    ssl_session_timeout 1d;
    ssl_session_cache shared:SSL:50m;
    ssl_session_tickets off;

    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_ciphers "EECDH+ECDSA+AESGCM EECDH+aRSA+AESGCM EECDH+ECDSA+SHA384 EECDH+ECDSA+SHA256 EECDH+aRSA+SHA384 EECDH+aRSA+SHA256 EECDH EDH+aRSA !RC4 !aNULL !eNULL !LOW !3DES !MD5 !EXP !PSK !SRP !DSS";
    ssl_prefer_server_ciphers on;

    ssl_certificate /etc/nginx/ssl/wiki.example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/wiki.example.com.key;
    ssl_trusted_certificate /etc/nginx/ssl/CA.crt;

    location / {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_next_upstream error timeout http_502 http_503 http_504;
    }
}

Enable the wiki.example.com VirtualHost:

# cd /etc/nginx/sites-enabled/
# ln -s ../sites-available/wiki.example.com.conf
# unlink default

Restart Nginx:

# systemctl restart nginx

Configure Wiki.js

After the installation you will be asked if you wish to run the configuration wizard. Select this and continue:

Yes, run configuration wizard on port 3000 (recommended)

Now browse to http://wiki.example.com/ and follow the installation wizard:

  • Welcome!: Start
  • System Check (if all good): Continue
  • General:
    • Site title: ExampleWiki
    • Host: https://wiki.example.com
    • Port: 3000
    • Site UI Language: English
    • Public Access: Not selected
    • Press: Continue
  • Important Considerations: Continue
  • Database: mongodb://localhost:27017/wiki
  • Database Check: Continue
  • Paths:
    • Local Data Path: ./data
    • Local Repository Path: ./repo
  • Git Repository: Skip this step
  • Git Repository Check: Continue
  • Administrator Account
    • Administrator Email: admin@example.com
    • Password: MySecretCombination
    • ConfirmPassword: MySecretCombination
  • Finalizing: Start

Enable Wiki.js on startup

# npm install -g pm2
# pm2 startup
# pm2 save

Set up LDAP

This is an optional step for those wishing to integrate Wiki.js in their LDAP infrastructure.

Trust CUT IST ISSUING CA

Connect to the LDAP (AD) server and get all certificates:

# openssl s_client -showcerts -connect dcs03ist00.lim.tepak.int:636 | tee ldap.log

Hit ‘Ctrl-C’ to end the command.

The certificate with the ID ‘1’ in ldap.log is the ISSUING CA certificate. Extract the CUT IST ISSUING CA certificate and save it in cut_issuing_ca.crt:

-----BEGIN CERTIFICATE-----
MIIF7DCCA9SgAwIBAgITcgAAAAakujIFDl5tvQAAAAAABjANBgkqhkiG9w0BAQsF
ADBNMQswCQYDVQQGEwJDWTEoMCYGA1UEChMfQ1lQUlVTIFVOSVZFUlNJVFkgT0Yg
VEVDSE5PTE9HWTEUMBIGA1UEAxMLQ1VUIFJPT1QgQ0EwHhcNMTUwMzA5MTAyMjAy
WhcNMjAwMzA5MTAzMjAyWjBeMRMwEQYKCZImiZPyLGQBGRYDaW50MRUwEwYKCZIm
iZPyLGQBGRYFdGVwYWsxEzARBgoJkiaJk/IsZAEZFgNsaW0xGzAZBgNVBAMTEkNV
VCBJU1QgSVNTVUlORyBDQTCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEB
AMPHOqer+ovT+Y99lI4dSYa+XHAnCaup8vbLE9x4iKEjq9gfYq8Gs3Aujx6h32Y8
DLJcKHgxlzqwn6zzq2YSDziFPCka5bAZswaFqvD6fm22oujRmlUBDrW37OsP3nwJ
gT5GMUSzvE0ZvcdjotCm2iBDfGryJgU3PFgAvPXVyFq8bV1jTYBP8sqsWssIdOMg
doN8RLKj6gwJIwA1cvhvnuePU6a2HlHI314GaUqNtyX5PZ5VbUNIQXBt+McPNP0d
sg8yKzfBtl6V4U9xheKmdKxfIB0rN1L/uqzoewIrwUXab7l9kbSYoKqiPZkYnpEi
TQ8msXa/6TL0grQR085sugUCAwEAAaOCAbIwggGuMBAGCSsGAQQBgjcVAQQDAgEA
MB0GA1UdDgQWBBR0NThxWu0JH7JLfLf75h9dis6S6DCBnwYDVR0gBIGXMIGUMIGR
BgsrBgEEAYLiewEBATCBgTBOBggrBgEFBQcCAjBCHkAAQwBlAHIAdABpAGYAaQBj
AGEAdABpAG8AbgAgAFAAcgBhAGMAdABpAGMAZQAgAFMAdABhAHQAZQBtAGUAbgB0
MC8GCCsGAQUFBwIBFiNodHRwczovL3BraS5jdXQuYWMuY3kvcGtpL2Nwcy5odG1s
ADAZBgkrBgEEAYI3FAIEDB4KAFMAdQBiAEMAQTALBgNVHQ8EBAMCAYYwDwYDVR0T
AQH/BAUwAwEB/zAfBgNVHSMEGDAWgBSY/OKi6F/HKZRvR0574L5dNf4KwzA5BgNV
HR8EMjAwMC6gLKAqhihodHRwOi8vcGtpLmN1dC5hYy5jeS9jcmwvQ1VULVJPT1Qt
Q0EuY3JsMEQGCCsGAQUFBwEBBDgwNjA0BggrBgEFBQcwAoYoaHR0cDovL3BraS5j
dXQuYWMuY3kvcGtpL0NVVC1ST09ULUNBLmNydDANBgkqhkiG9w0BAQsFAAOCAgEA
IVJPKacKldP8WEFkCNGJq9PIFr5R0MuD6DUSifnnj94Z004ET29+C3sesUsWHMLd
p1LA+6pmF8JUAUg5Zzab2gV3vHVorYIPlPs2IorhTlnl5k9UCAjBuT3HV1kcCt2R
L601KYIYN3QIiVaA3k7xOPp+Y9nL/OlvjE3S6MVeC2WJYl1Ai+eZWQpL/SvAA8vH
NQzCqqNUnCT1zr9Uei71r0elgtoevsRmWRVWPmOl4Sft0amJ5fXDhYvN0KiktDTt
SeBy+YY48vzbhTRPFhUsj84ePyQcBqtLWQlyhmFs8b+gtSbmYWA1AnWgA+/Fd37Z
k75/x8qqnyZ4BFBKzEzukIKyapnRf8ARzL6oHo7XIXmkUb4cBzf5asNkqRRcUxrl
IEPGuGjr0+yo49f0uP3v+xE+Vyam27B3A4hhrgyqIe26/FxlbkyHT7BLhxCSu7kX
hymb9rZwGy8LR6JVCHhOllXm0eaKiRxFBYcgofekgUfjL5Coip6wAzabtrqE38zX
EkQ6sFi4hFM2ifVKA6lU1OUfeK1j5hEcmGzyCS+4aWXeY9/tTlDj28Z04cp34jla
Oc2Q4VBmO6yA1d6L22o9gTb4gvTcghHOJezWeo5PFsSD+GmJfq4JLvaHk9gsbb5D
099bze3eSDCISoR5ce6mdkD6pr5PfsVfZHhSktc4nk4=
-----END CERTIFICATE-----
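
Instead of copying the block by hand, the certificate can be cut out of ldap.log with awk. This is a sketch that assumes the log contains the PEM blocks in chain order, as openssl s_client -showcerts prints them; extract_cert is a hypothetical helper name. Since the IDs in the log start at 0, the certificate with ID 1 is the second block:

```shell
# Hypothetical helper: print the Nth PEM certificate block (counting
# from 1) found in a file.
# Usage: extract_cert <file> <n>
extract_cert() {
    awk -v n="$2" '
        /-----BEGIN CERTIFICATE-----/ { c++ }      # count the blocks
        c == n { print }                           # print lines of block n
        c == n && /-----END CERTIFICATE-----/ { exit }
    ' "$1"
}
```

For example: extract_cert ldap.log 2 > cut_issuing_ca.crt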

Verify the certificate with:

# openssl x509 -text -in cut_issuing_ca.crt

Add the CUT ISSUING CA in the trusted chain of the system:

# cp cut_issuing_ca.crt /usr/local/share/ca-certificates/
# update-ca-certificates

Configure LDAP for Wiki.js

Make these changes in /srv/wiki.js/config.yml:

  ldap:
    enabled: true
    url: 'ldap://ldap.example.com:389'
    bindDn: 'cn=wiki,ou=dsa,dc=example,dc=com'
    bindCredentials: 'MyLDAPCredentials'
    searchBase: 'ou=people,dc=example,dc=com'
    searchFilter: '(uid={{username}})'
    tlsEnabled: true
    tlsCertPath: '/etc/ssl/certs/ca-certificates.crt'

Give Access permissions to authenticated users

Visit the Admin URL:

https://wiki.example.com/admin

Click on ‘Users’. You will get a list of users. You can give ‘Read and Write’ access to them from the ‘Access Rights’ field and you can upgrade them to ‘Global Administrators’ from the ‘Role Override’ field.

NOTE: LDAP users need to log in first before they can be given write access.

Enjoy your newly created Wiki!
