RethinkDB: A NoSQL Database for Real-Time Applications

RethinkDB is an open source, scalable database that makes building real-time apps dramatically easier.

With heterogeneous and unstructured data on the World Wide Web (WWW) growing by leaps and bounds, traditional database engines face numerous issues related to schema management, concurrency, database integrity, security, parallel read-write operations, resource optimisation and more. Real-time applications receive heavy traffic from different channels including social media, mail groups, satellites and the Internet of Things (IoT). Such traffic is generally unstructured and voluminous, and is difficult to handle with classical relational database management systems (RDBMS). To cope with these Big Data issues of velocity, volume and variety (the 3Vs of Big Data), Not Only SQL (NoSQL) databases, which can handle unstructured and heterogeneous data with higher performance, came into existence.

Today, there are different types of NoSQL databases in diverse categories to help specific applications achieve a higher degree of performance and accuracy (Table 1).

RethinkDB as a document-oriented NoSQL database

RethinkDB (https://www.rethinkdb.com) is a high-performance NoSQL database for real-time applications, with an effective Web based administrative panel and user interface. It is classed as a document-oriented database, and provides features for the storage, retrieval and management of semi-structured data with minimum delay in database operations. The key concept behind a document-oriented database is the storage and handling of documents in unstructured or semi-structured formats. Across document-oriented NoSQL databases, the encodings used include YAML, BSON, XML and JSON, along with binary forms such as PDF, MS Word and spreadsheets; RethinkDB itself stores its documents as JSON.

Figure 1: Official portal of RethinkDB
Figure 2: Starting the RethinkDB server
Figure 3: Dashboard and control panel of RethinkDB

The real-time push architecture of RethinkDB

Real-time Web applications consume resources in parallel from different sources, including database engines, live update scripts, bandwidth, concurrent connections, security modules, etc. To access and update real-time data in Web applications, there is a need to integrate high performance technologies so that the live streaming data can be inserted or updated with minimum delay and resource consumption. For example, live polling or live chat based applications may be delayed because of a high number of parallel users on the same channel, and this can increase the waiting time. With RethinkDB, this delay can be decreased as it has been developed specifically for real-time Web applications, with the required scalability and ease of use.

RethinkDB describes itself as the first open source, scalable JSON database built for real-time Web applications. Using RethinkDB, the developer can direct the database engine to push updated query results to the running application in real time, rather than having the application poll for changes. This real-time push architecture drastically reduces the effort, time and resources needed to build real-time applications.
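RethinkDB calls these pushed result streams changefeeds. As a minimal sketch, assuming the Python driver's changes() command is used to open the feed, each pushed document carries an old_val and a new_val field, and an application could dispatch on them roughly as follows. The helper names and the sample documents are illustrative only, and a plain list stands in for a live feed so that the sketch runs without a server.

```python
def classify_change(change):
    """Label a changefeed document as an insert, update or delete."""
    old_val = change.get("old_val")
    new_val = change.get("new_val")
    if old_val is None and new_val is not None:
        return "insert"
    if old_val is not None and new_val is None:
        return "delete"
    return "update"


def consume_feed(feed, on_change):
    """Call on_change(kind, change) for every document the feed yields."""
    for change in feed:
        on_change(classify_change(change), change)


# With a live server, the feed would come from the driver, for example:
#   feed = r.table("chat_users").changes().run(conn)
# Here a hand-written list stands in for it (illustrative data only).
sample_feed = [
    {"old_val": None, "new_val": {"id": 1, "name": "Username-1", "age": 17}},
    {"old_val": {"id": 1, "name": "Username-1", "age": 17},
     "new_val": {"id": 1, "name": "Username-1", "age": 18}},
    {"old_val": {"id": 1, "name": "Username-1", "age": 18}, "new_val": None},
]

seen = []
consume_feed(sample_feed, lambda kind, change: seen.append(kind))
print(seen)
```

Because the loop blocks on the feed rather than re-running the query on a timer, the application does no work until the database has something new to report.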

The architecture of RethinkDB is highly advantageous in real-time Web applications, high traffic e-commerce applications, real-time mobile apps, sharing information between connected devices, the Internet of Things (IoT), the Cloud of Things (CoT), multi-player real-time games, satellite channels and wireless signal analytics.

Features and advantages of RethinkDB

RethinkDB has a flexible and high-performance query language, ReQL, to provide real-time push based updates to multiple users in parallel. In addition, it provides intuitive operations with the APIs for monitoring real-time changes, which are easy to set up and understand for programmers. It is being used by assorted corporate giants, Fortune 500 companies, as well as startups, to manage live applications.

RethinkDB has a vibrant as well as large community of more than 100,000 software developers and troubleshooting experts across the world. These developers and contributors keep on customising the code and sharing their work on online portals to enrich the overall community of RethinkDB programmers.

Some of the key users of RethinkDB are:

  • Jive Software
  • Narrative Clip
  • Mediafly
  • Pristine.io
  • CMUNE
  • NodeCraft
  • Platzi
  • Workshape.io

Installation of RethinkDB

The official packages of RethinkDB for Ubuntu Linux, OS X, CentOS, Debian and Windows are available. In addition, community-supported packages are available on rethinkdb.com for Arch Linux, OpenSUSE, Fedora, Linux Mint, Raspbian and Gentoo.

Official RethinkDB client drivers are available for JavaScript, Ruby, Python and Java, while community-supported drivers are available for C#, Clojure, Delphi, Go, Lua, PHP, Swift, R, Common Lisp, Elixir, Haskell, Nim, C++, Dart, Erlang, JavaScript (neumino), Perl and Rust. Drivers with limited features are also available for Objective-C and Scala.

Figure 4: Creating a new database
Figure 5: View the database after creation
Figure 6: Adding a table in a specific database

Installing RethinkDB on Windows

The 64-bit binaries of RethinkDB for Windows 7 and later versions are available from the official installation page at https://www.rethinkdb.com/docs/install/windows/. A 64-bit version of Windows is required to install RethinkDB.

A ZIP archive is downloaded and unpacked in any directory. In the directory of RethinkDB, there is a file called rethinkdb.exe. On double-clicking it, the server gets initiated and is then available for further database operations.

After starting the RethinkDB server, open the administration console in a Web browser at the URL http://127.0.0.1:8080, so that the different options for database administration can be viewed.
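If the console does not load, a quick check of whether anything is listening on the port can help. The following is a minimal sketch using only the Python standard library; the helper name is_port_open is an illustrative choice, and 8080 is assumed to be the unchanged default port of the administration console.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 8080 is the default port of the Web administration console.
if is_port_open("127.0.0.1", 8080):
    print("Admin console reachable at http://127.0.0.1:8080")
else:
    print("Nothing is listening on 127.0.0.1:8080; is the server running?")
```

The same helper can be pointed at port 28015, which the client drivers use by default.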

The options for creating the database and tables with different privileges are available in the administrative panel of the RethinkDB server. The connected servers, tables, indexes and resources can be viewed in the panel, as shown in Figure 3.

To use a specific directory for the storage and logging of data, the following instruction is used:

WindowsDirectory:\>rethinkdb.exe -d c:\RethinkDB\data\

To specify a server name and a cluster node to join, use the command given below:

WindowsDirectory:\>rethinkdb.exe -n MyDomain -j mycluster.server.com
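When the server is launched from a script rather than by hand, the same options can be assembled programmatically. The sketch below is one possible way to do so: build_server_command is a hypothetical helper, and it uses the long option names --directory, --server-name and --join, which correspond to -d, -n and -j.

```python
def build_server_command(data_dir=None, server_name=None, join=None,
                         executable="rethinkdb"):
    """Assemble an argument list for starting a RethinkDB server."""
    command = [executable]
    if data_dir:
        command += ["--directory", data_dir]        # long form of -d
    if server_name:
        command += ["--server-name", server_name]   # long form of -n
    if join:
        command += ["--join", join]                 # long form of -j
    return command


# Values taken from the two commands shown above.
command = build_server_command(
    data_dir=r"c:\RethinkDB\data",
    server_name="MyDomain",
    join="mycluster.server.com",
    executable="rethinkdb.exe",
)
print(command)
# The list could then be handed to subprocess.Popen(command) to start the server.
```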

In the administrative console, there is the option to create a new database with any name. This database name can then be referenced by the front-end code of real-time applications.

Once the database is created, it is visible in the administration console. In a similar way, a table can be created in a particular database by specifying its primary key, along with the write acknowledgement setting.

After creating the table in a specific database, the RethinkDB engine returns the message in JSON format so that the confirmation of the write operation can be displayed with different values inserted in the database table.
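As a rough illustration of how an application might read that confirmation, the sketch below parses an abbreviated response of the kind returned by table creation. The sample JSON is hand-written for illustration (a real response carries more fields), and summarise_table_create is a hypothetical helper name.

```python
import json

# Abbreviated, illustrative response; a real one carries more fields
# (for example the table's id, shards and durability settings).
raw_response = """
{
  "config_changes": [
    {
      "old_val": null,
      "new_val": {"db": "mydatabase", "name": "chat_users", "primary_key": "id"}
    }
  ],
  "tables_created": 1
}
"""


def summarise_table_create(response):
    """Return (created, table_name, primary_key) from a table creation result."""
    created = response.get("tables_created", 0) == 1
    changes = response.get("config_changes", [])
    new_config = (changes[0].get("new_val") or {}) if changes else {}
    return created, new_config.get("name"), new_config.get("primary_key")


result = summarise_table_create(json.loads(raw_response))
print(result)
```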

Figure 7: Returned message after creating a table on the dashboard of RethinkDB
Figure 8: Installation of RethinkDB with Python
Figure 9: Execution of RethinkDB instructions in the Python IDLE shell

Installing RethinkDB on Ubuntu

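For the installation of RethinkDB on Ubuntu Linux, both 32-bit and 64-bit flavours are available. The following commands add the RethinkDB repository and its signing key, and then install the package: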

$ source /etc/lsb-release && echo "deb http://download.rethinkdb.com/apt $DISTRIB_CODENAME main" | sudo tee /etc/apt/sources.list.d/rethinkdb.list

$ wget -qO- https://download.rethinkdb.com/apt/pubkey.gpg | sudo apt-key add -

$ sudo apt-get update

$ sudo apt-get install rethinkdb

Compilation and installation from source

To compile and install from the source, use the following commands, replacing version with the release number being built:

$ sudo apt-get install build-essential protobuf-compiler python \
    libprotobuf-dev libcurl4-openssl-dev \
    libboost-all-dev libncurses5-dev \
    libjemalloc-dev wget m4

$ wget https://download.rethinkdb.com/dist/rethinkdb-version.tgz

$ tar xf rethinkdb-version.tgz

$ cd rethinkdb-version

$ ./configure --allow-fetch

$ make

$ sudo make install

To start the server on Ubuntu Linux, execute the following instruction from the terminal window:

$ rethinkdb

Creating a table using the Data Explorer tab

In the RethinkDB administrative console, there is a Data Explorer tab, which provides the panel for executing the commands that will work on the databases. In the following example, a new table titled chat_users is created in the database titled mydatabase.

r.db('mydatabase').tableCreate('chat_users')

On clicking the Run button at the bottom-right of the administration console, the query is executed and the operation is implemented. In addition, the key combination Shift+Enter can be used to run the query entered.

To insert records in the form of JSON in the table chat_users, use the following command:

r.db('mydatabase').table('chat_users').insert([
    { name: 'Username-1', age: 17 },
    { name: 'Username-2', age: 35 }
])

To count the number of records in the table, give the following command:

r.db('mydatabase').table('chat_users').count()

To display the records of application users older than 20 years from the table, use the following command (replace gt with ge to include users aged exactly 20):

r.db('mydatabase').table('chat_users').filter(r.row('age').gt(20))
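The Data Explorer queries above are written in JavaScript syntax. For comparison, a hedged sketch of the same three operations with the Python driver is given below. The function names are illustrative, and both the driver handle r and an open connection conn are assumed to be supplied by the caller, so the sketch does not itself import the driver or contact a server.

```python
def add_users(r, conn, users, db="mydatabase", table="chat_users"):
    """Insert a list of user documents and return the driver's result."""
    return r.db(db).table(table).insert(users).run(conn)


def count_users(r, conn, db="mydatabase", table="chat_users"):
    """Return the number of documents in the table."""
    return r.db(db).table(table).count().run(conn)


def users_older_than(r, conn, age, db="mydatabase", table="chat_users"):
    """Return the documents whose age field is greater than the given age."""
    query = r.db(db).table(table).filter(r.row["age"].gt(age))
    return list(query.run(conn))
```

With the two records inserted earlier, users_older_than(r, conn, 20) would be expected to return only the Username-2 document, since 17 does not pass the filter.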

Programming with RethinkDB using Python

Python is a free and open source programming language that has interfaces for almost all NoSQL databases. It is widely used for cloud applications and assorted real-time high-performance Web applications. The RethinkDB client driver for Python can be easily installed with pip, using the following command:

WindowsDirectory:\> python -m pip install rethinkdb

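The package installer of Python automatically fetches the RethinkDB driver library and adds it to the existing installation of Python.

Once the driver is installed, the IDLE shell can use the methods and APIs of RethinkDB by importing the module under a short handle, conventionally r. With driver releases 2.4 and later, the handle is instead created with from rethinkdb import RethinkDB followed by r = RethinkDB().

>>> import rethinkdb as r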

Using the Python IDLE shell, or within a script, the connection can be opened with the following command, which connects to the client driver port of the server (28015 by default):

>>> r.connect("localhost", 28015).repl()

The repl() call makes this the default connection for the Python shell, so that subsequent queries can call run() without passing a connection explicitly.

The function table_create() is used to add a new table in the particular database, as shown in Figure 9.
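Since Figure 9 is not reproduced here, the following sketch shows one way table_create() might be used from a script. It assumes the driver handle r is passed in by the caller; create_chat_table and its default arguments are illustrative names, and the table_list() check is an addition so that the function can be run more than once without failing on an existing table.

```python
def create_chat_table(r, host="localhost", port=28015,
                      db="mydatabase", table="chat_users"):
    """Connect to the server and create the table if it does not exist yet.

    Returns True if the table was created, False if it was already there.
    """
    conn = r.connect(host, port)
    try:
        existing = r.db(db).table_list().run(conn)
        if table in existing:
            return False  # nothing to do
        r.db(db).table_create(table).run(conn)
        return True
    finally:
        conn.close()  # always release the connection
```

Passing the connection to run() explicitly, as done here, avoids relying on repl(), which is mainly a convenience for interactive sessions.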

The author is the managing director of Magma Research and Consultancy Pvt Ltd, Ambala Cantonment, Haryana. He has 16 years of experience in teaching, in industry and in research. He is a projects contributor for the Web-based source code repository SourceForge.net. He is associated with various central, state and deemed universities in India as a research guide and consultant. He is also an author and consultant reviewer/member of advisory panels for various journals, magazines and periodicals. The author can be reached at kumargaurav.in@gmail.com.
