I made a mistake that I must correct. I stupidly broke my rule of only writing comments here, and made a couple comments on this post complaining about a bunch of problems in MySQL.
After being a happy PostgreSQL user for years, I’ve finally had to use MySQL for the last 6+ months.
I have to admit, I was starting to think it wasn’t all that bad–that the “MySQL is a toy” line was outdated.
– Dammit, MySQL
I’ve been developing database applications, mostly for the web, for seven years now, professionally for five. I’ve used every kind of database for every kind of app. I’ve used MySQL, Postgres, sqlite, CouchDB, and some others. I’ve also worked on applications ranging in size from a tiny blog nobody reads to gigantic high traffic applications with millions of rows, or millions of hits.
Lately I have noticed that there is a very strong backlash against MySQL. That original blog post was linked from Hacker News, and if you look at the comments there, everyone who is in favor of MySQL is getting downvoted.
Let me tell you something. All these technological complaints people have with MySQL, they are 100% true. MySQL has no transactions. Its foreign key constraints are not up to par. It doesn’t follow standards perfectly. There are very real possibilities of your data becoming invalid, or even corrupt. MySQL is far from being a super perfect database under the hood.
If all those things are true, then MySQL sucks, and we should never use it, right? Wrong. Even though MySQL has these issues, they simply do not matter for the overwhelming majority of applications. If you’re working for a bank, a hospital, or the military, then MySQL is right out. In those cases, reliability and data integrity are so important, and the system is so large, you just can’t be using MySQL. I’m willing to bet you’re not doing any of those things, are you? You’re doing a blog, or maybe a social network.
You might say that you are doing a blog, but PostgreSQL and MySQL are both free and open source. You may not absolutely need the data integrity and standards compliance, but why not have it anyway? It won’t hurt, right?
I disagree with this. In my many years of using both Postgres and MySQL, my experience is that Postgres is extremely finnicky. Because it demands standards compliance and data integrity, it demands a lot more work from the application developer. Let me give some examples I have encountered in the real world.
Transactions are a big reason Postgres is so great. You can make sure your data is always correct because a group of writes will go in all together. This is very important in a serious application. The thing is, if you don’t need transactions, using them can cause you a great deal of trouble. In building multiple Django applications on Postgres, I have seen a very frequent error when a transaction has been opened, but not closed. The result is that the entire database is stuck. You can’t read or write from it until the transaction is closed out. Whenever this happens, I just restart the db server, and everything is back to normal.
It shouldn’t be possible for this to happen! The fact that restarting the database fixes it is proof that it doesn’t even need to happen. Yes, transactions are good, but a small user error causing the entire database to go out of service is much worse. In a huge application, you can take the time to test and make sure you don’t make those errors. For a normal every day application, you just want it to stay up. You don’t care about some unclosed transaction. Just let it slide.
A similar situation happens with the much lauded foreign key constraints. I’ll have myself two tables with a foreign key in one pointing at the other. Then something will happen resulting in a related row being deleted. It’s probably deleted because I wanted it to be deleted. The thing is, Postgres doesn’t like this. It’s invalid data. You will now get all sorts of errors from Postgres because the data is invalid. This is going to cause all, or part, of your application to go down with errors.
MySQL ignores this, and wt ill continue servicing your queries. Your application will see no downtime. For normal people with normal sites, having their web site stay up without error is a hell of a lot more important than a dangling foreign key. In most real world cases, that broken foreign key constraint won’t cause any issue whatsoever. Pages not serving is a gigantic problem for even the smallest site.
Another problem Postgres used to have is lack of replication. Every place I have ever worked with MySQL has used the master/slave configuration. It’s just so easy to setup, and has been for a long time. Postgres just added the feature, so how much can you trust it? Even if you do trust it, it’s very hard to configure it properly. There are third party solutions like PL/Proxy and Slony, but they are even more difficult to configure. MySQL makes it easy, and that is my final point.
MySQL is very easy. It has extremely user friendly commands such as ‘show databases’ or ‘show full processlist’. In Postgres, you have to learn commands like ‘\l’ and ‘\d’. I’m a vim user, so I have no problem with this, but what if you’re some guy with a blog and not a vim using developer? Postgres is just really cryptic. MySQL is much more forgiving of the user, and more accessible to normal people.
Another example of MySQL’s ease of use is in its authentication model. Granted, it’s not all that simple. You specify [email protected] with a password. Then you give that user permissions on different databases, or global permissions. It’s a little complex, but it’s comprehensible to normal people. It’s the same concept as the permissions they setup when they do a PHPBB, or something like that. The administrative user has all global permissions. The moderator user can update this table, insert that table, but only select these other tables.
Postgres has the pg_hba.conf. The first time I encountered it, it took about 20 minutes of reading documentation before I understood what the hell was going on. Yes, I fully admit that this configuration gives a lot more power and a lot more options than the MySQL way of doing things. The thing is, none of those options are useful for every day people. The only thing I ever do in terms of database permissions is I create a new MySQL user and password with full permissions to each specific database. Then I give that username and password to the application in question. In MySQL, this is the default, and PHPMyAdmin makes it easy. In Postgres, that is about as far from the default as you can get.
There has been a migration of database popularity, especially in the web application world. People started with MySQL. They were upset about data integrity, so they moved to Postgres. Then they were upset about slowness, so they moved to “NoSQL.” The thing is NoSQL has all the same problems as MySQL. Have you not read all these stories about people losing all their data in MongoDB? Losing all your data is a much bigger problem than a dangling foreign key or a broken transaction.
Also, people seem to forget that MySQL has multiple choices of storage engine. Most of the complaints with MySQL data integrity are actually complaints with the MyISAM storage engine. If you switch MySQL to use the InnoDB storage engine, many, but not all, of those complaints are now invalid. The thing is, most people don’t need InnoDB, the same way they don’t need Postgres. I actually think that anyone using InnoDB should use Postgres instead. InnoDB is a lot slower than MyISAM.
Look at all the companies people are talking about that are supposedly using NoSQL. Digg, Twitter, Facebook, bit.ly. People say things like “bit.ly uses MongoDB.” What they don’t tell you is that bit.ly uses MongoDB for just one feature of the site. The actual core of the bit.ly service, the URL shortening, still uses MySQL. The same is true at Facebook. They use Cassandra for some stuff, but the core of Facebook is still MySQL. If it’s good enough for them, then it works. If it works, it’s good.
I understand there are some Postgres/database type people out there who are probably going insane if they read this. They have some sort of OCD-like situation where they need all the data to be exactly perfect, or it drives them crazy. You know, if I look at some of my MySQL database, there is bad data in there. I see it with my own eyeballs. It’s a tiny bit annoying. But then I look at my actual application. What do you know, my application is still running. It has no errors anywhere. It’s fast. It’s working. That’s all that matter. Everything else is just anal developers who need to get over it.
If you’re doing a serious big application, use Postgres, please. I do not want data errors ruining my vacation when I book a flight. I do not want my bank to lose my money. If you’re doing another every day web site, use Postgres if you want, but it’s not necessary. MySQL is faster, easier, and it works. The web site will load, and we can all rest more easily.