PostgreSQL 9.0 Released!

Hi There,

It is time to celebrate! The PostgreSQL team has released its 9.0 release today! My deepest congratulations to the whole team.

This release has a special taste for me. Many of you know that I have been using PostgreSQL in my day-to-day job since 2004, and six years is enough time to figure out some things.

First, this release has lots of largely expected features, that are better explained at http://www.postgresql.org/about/news.1235.

Second, it’s nice to see how things are happening inside PostgreSQL. Their team is compound by very skilled people whose commitment is to keep writing what is the most advanced open source database engine in the World. And it really is not only from the code point of view but also from the community point of view.

Third, if SCOracle continues its massacre (and everything seems to indicate they won’t stop, but I don’t care, neither Oracle nor MySQL are choices for me), former MySQL users will migrate soon or late to other alternatives: many will migrate to NoSQL (and for God’s sake, it’s not a choice for everyone but they will indeed migrate because it’s the hype), others to small databases (such as SQLite or Apache Derby), and others will migrate to PostgreSQL. All these alternatives are valid, and it will be proven that the Open Source universe has alternatives to almost every single piece of code. I expect PostgreSQL 9.0 will raise its number of deployments in the next year.

No matter what happens, I’m proud of being a PostgreSQL user. And if at least one PostgreSQL developer reads this, I repeat: Congratulations, yours is the best job in the World.

Regards!

Posted in Computer Stuff, Databases

Turkmenbashi Daemon Creator 1.0.0

Hi There,

The first release of the Turkmenbashi Daemon Creator was published a few hours ago. It is released freely under a BSD License.

You can find it here in the Python Package Index or here in my Github repository. Oh, and I have a page here about this release!.

History

Once upon a time I was asked to create a collection of UNIX daemons for specific purposes in Linux. I decided that instead of following the instructions given step by step, I can create a library containing the basics and then focus on specific daemon behaviours.

The UNIX daemons that I wrote had something bad: they all had a copy of this library. I merged some functionality into an unique library that has been in production since february. Finally, I decided to open source this library. Hopefully someone will take advantage of it. If you are reading this and will use my library, please drop me a line! I would be glad to hear news about it.

Regards.

Posted in Software Development

CouchDB… database developers headache?

Hi There,

Today one of my partners at the company I’m working for asked me the following: “How to migrate an entire application with its data from a relational database (let’s say, at least MySQL with InnoDB) to a non-relational one, such as CouchDB? It is an interesting question, especially for a database-oriented person like me. In this post, I’ll try to answer you all.

What is CouchDB?

First, let me explain what is CouchDB: It is a non-relational, partially structured document-oriented database, maintained by the Apache Foundation. It works via HTTP protocol, and it’s quite easy to perform replication tasks and all sorts of magic.

I want to make it clear: it is not relational, it is not an object-oriented database. It’s all about structured documents.

Sooo… how to create applications using it? There is no structure!

This affirmation is wrong from the beginning. Data must need a structure, but it does not imply that it has to be saved in schemas, tables and so on; LDAP does not have a table structure but a tree structure. Even some games like Need for Speed, Quake and Fifa 2010 need their data organized in structures.

If you think of it, you will learn that the way you store your data really matters: a relational database is not efficient to store maps for Need for Speed, a file is certainly the worst way to perform operations concurrently. No-SQL databases are far more efficient than SQL databases in retrieving data, but you will probably finish with a poor design if you have a software architecture that needs 500 document types. (And, no. Once upon a time, a teacher at my university said that nothing with more than 30 tables is useful; the reality is that when you have many applications that are related to others, you can finish up with a huge database containing more than 300 tables. Have you ever seen the database design behind OpenBravo?).

(Update: here you can find a better introduction to what can be achieved with NoSQL. It solved many of my doubts about it.)

That said, I’ll explain the traditional process to create software, in short terms:

  • Understand the problem.
  • Design a solution.
  • Write code.
  • Deploy.
  • Get the money.

It does not change. You still have to design a solution, but now you do not have a relational database structure. Instead, you have to define structures (and document them properly) to represent your data.

And I must point it here: CouchDB is still a database. It is still required to have CRUD (Create, Retrieve, Update and Delete) methods for your data. Instead of creating complex SELECT, INSERT, UPDATE and DELETE statements, however, we serialize/unserialize data between the application and CouchDB.

What do I see here? On one hand, you don’t have to worry about inserting foreign keys correctly, instead of saying (1 => ‘Male’, 2 => ‘Female’), you can place the word directly to represent it. On the other hand, if your application has inconsistencies (‘MALE’ and ‘Male’ are not the same thing), you have to be careful with data consistency, something that you did never care of while developing applications.

As a result of this, if you need to migrate your data from a relational database (Let’s say, you had Oracle) to CouchDB, you’ll be in one of these situations:

  • I have different models in the database: Remember that you won’t have a relational database. You will have to identify a document in CouchDB by its type (Person, Post, Project, Task, et al).
  • I have some significant tables that must be in the new “model”: Migrate these fields from the relational database and create an unique identifier for each document on CouchDB (a name or slug may be enough). For “Person”, it’s ideal to provide an e-mail or a nickname.
  • I have a one to one relationship in my relational database: Forget the rule about “the record, only the record and nothing else but the record“. Perform a SELECT * FROM table1, table2 WHERE table1.id = table2.id; on your relational database. This gives you an idea of what you want to send to the CouchDB resulting database.
  • I have a one to many relationship in my relational database: Are you sure that that table is needed? Most of these tables are defined to hold parameters in the database (gender, city, marital_status et al). If that is the case, there is a human-readable identifier such as “Male“, “London“, “Single“, that needs to go in the SELECT. Otherwise, you can store collections in the database, such as “comments”: [ "commentx", "commenty", "commentz", "commentw"]. If it is not enough (by example, “project” to “task“, the project slug must be stored in the task. This will still represent a “relation”. Cough, I said that there are no relations in the database… The truth is that this relation only exists in your mind (that is what we database developers use to call a “weak relation”).
  • I have a many to many relationship in my relational database: You can rethink it as an one to many relationship and model it on the new document type as described before if it’s not significant. Otherwise, you will need to store a document with that relationship.

Once you get used to it, you’ll learn that there are lots of paradigms, but these are not the important thing when speaking about data. The real son of Chuck Norris understands that being successful in computer software development is to handle data structures rather than stick on these paradigms.

Hopefully it will help you all. And, thanks to Álvaro Véliz for introducing me to CouchDB. It seems cool.

Regards!

Posted in Databases, Software Development

Busy weekend

Hi There,

I have been very busy this weekend, helping my friend Daniela with a software application she has to show on monday.

I am still in Chile, and I do not plan to get back to London in the next two months, at least; I have very good friends here, and wish to keep them.

Oh, and watching the news, I realized that the 33 men who were trapped in a mine are all right after 17 days of uncertainty. That is such a cool thing about chileans.

Regards!

Posted in Personal

Knocking on Heaven’s Door

A Project Leader passed away.

When he arrives Heaven, knocks the door and St. Peter opens it. Surprisingly, there are millions of angels happy to receive him as if he were a rock star.

The Project Leader asks:

- What is wrong here?

St. Peter answers:

- Nothing is wrong. We are very happy because you are the first person who has lived for 275 years.

The Project Leader did not understand anything. The situation was too confusing for him. Then he thought it was a joke and said:

- But I’m just 37 years old!

To which, St. Peter’s replied:

- It’s impossible! We performed some calculations on the hours you charged your customers and the result is exactly 275 years, 9 months, 3 days and 11 1/2 hour!

Posted in Personal

Danielle

Hi There,

I am happy today, and the reason is that someone I love sent me an e-mail after a very long time without knowing of each other. She is a powerful source of inspiration for me.

Danielle, I am glad to know that I am still in your heart, because you never left mine and you will never leave it. This was not an adventure because we never had a final good bye and the story of our friendship can continue wherever we are and whenever we want to keep writing it. Sadly, distance can separate us, but it is our task to keep our feelings alive. I love you, and this feeling did not end. That is why I write a special kiss to you.

Regards.

Posted in Personal

PostgreSQL: Don’t forget to run Vacuum.

Hi There,

A week ago I spent some time in query optimization over a daemon running on top of a very small PostgreSQL server. Today, the database server became slower, apparently with no reason. It became worse: a colleague went back to the previous version of that source and it performed even slower.

The software

The daemon uses PostgreSQL as a message broker between applications. It is known that a real message broker should perform better but I did not program it.

The problem

The problem is that this piece of shit code performed about 20 inserts/updates/deletes to tables. Not bad at all, PostgreSQL can handle it, but nobody executed VACUUM before. After running VACUUM ANALYZE, I found that there was enough garbage to fill three football stadiums.

How (not) to solve it

Do a VACUUM FULL right away!

Why

VACUUM FULL blocks the table while performing cleanup tasks. It is much better to run a plain VACUUM periodically. But do not forget to run it. See below.

When inserting data, PostgreSQL reserves enough space in the database and then performs a write. This is to preserve the data integrity and ensure that the record is well written.

Same thing happens while updating: PostgreSQL inserts the new unmodified record and then disconnects the old record from the database in an atomic (or close to be atomic) operation. It is much better for consistency to do it this way instead of writing on top of the already written record, but it produces garbage in the database. That is why we should periodically perform VACUUM.

When to use VACUUM FULL?

Check the documentation. VACUUM ANALYZE will provide information on when to use it, but it all depends on intensive usage. For a database where the data is mostly inserted and selected, probably twice a year. And when VACUUM FULL is performed, it must always be followed by REINDEX.

And the result?

My boss is happy.

All the best.

Posted in Databases