There are a plethora of free/open source databases around, from the good old Berkeley DB, SQLite, MySQL, PostgreSQL, and the newer NoSQL DBs like MongoDB, to mention a few. Most of these have easy-to-use GUI interfaces too. As a result, the threshold to becoming a “database administrator” has become very low, and the quality of the average database is abysmal. People who do not know the A or B of database design are happily doing mission-critical stuff. Referential integrity is unheard of, and in the interests of temporary speed gains, the concept of normal forms is discarded.
As for security, don’t make me laugh. SQL injection was discovered in the last century, and the prevention is simple and well known — but guess which is still one of the most popular ways of cracking websites?
I try to keep this column as non-tech as possible, so I do not want to go deep into all of this. Since it is probably adequately covered in other articles in this issue, I will restrict myself to recounting my personal experiences in the world of SQL, as well as some things I have learned from the course in Python/Django that I run, which covers RDBMS usage.
I started my journey with MySQL, which at that time was the only option available and to be frank, used it for several years without knowing a thing about database design, or SQL, for that matter. I used Webmin to create and administer the DB, and Perl to connect the data to my application.
At that time, MySQL did not have foreign keys, transactions, sub-queries and a lot of other features that one nowadays associates with the idea of an RDBMS. Also, the manual had a lot of pages loudly proclaiming that these things were unnecessary. So I learned a lot of bad habits. And I had to write a lot of code to make sure only valid data went into the DB, and valid stuff came out of it. This was because I did not trust the DB.
At that time, I started working with an open source application that used PostgreSQL — so I asked the author why he did not have a MySQL version. He responded that the integrity of the data is the responsibility of the DB and that he had no intention of writing huge amounts of code to do what the DB should do anyway.
Lesson Number 1: The database has to look after the integrity of the data.
So I shifted to PostgreSQL. But there was no usable GUI for it. I was frantic, but there was no alternative. I had to learn command-line SQL — and since PostgreSQL was more or less ANSI-standard, I had to learn SQL. Since I had to find out what all the fancy “new” things like foreign keys, sub-queries, triggers, views, transactions, etc., were, I had to learn a bit of database design theory and got introduced to normal forms.
Although PostgreSQL will not let you do silly things like accessing it as the root or entering a date like 31st of February, or even referring to a non-existent table as a foreign key, it is quite possible to create a database in a MySQL kind of way.
Lesson No 2: Database design, especially adherence to the first two normal forms, is essential.
Then I went into production with a completely normalised database, which was well and carefully designed. My client was happy, since he was making money. And then, horror of horrors, he needed large-scale changes in the database structure. That’s when I learned lesson No 3: Any idiot can create a database, but it takes experience, skill, courage and command-line knowledge to alter one in production. In the ultimate analysis, no tool is going to help you.
This is the strangest article I have written for LFY, in the sense that I started out to say something, and wound up saying something else. But on re-reading it, I find it expresses something that I passionately believe in. So I hope someone, somewhere finds it of interest.