How a DNS Problem Can Take Your MySQL Server Down

Last week I was woken up by the monitoring team at my company. There was a problem with my system: a DNS outage was in progress, and as a side effect my app was down. Since the app handles a lot of traffic, it had to be fixed immediately.

I jumped to the computer and quickly diagnosed the system. Everything was fine except the MySQL connection pool, which was exhausted. My first thought was that this was just a coincidence, so I ran SHOW PROCESSLIST to see the list of MySQL processes. The output was an endless list of connections from the load balancer's IP address, all with "login" as their status. For high availability I run MySQL behind a balanced IP address shared between two MySQL servers. The balancer runs a health check every 5 seconds by connecting to MySQL and doing a simple SELECT on a table.

So for some reason the load balancer was unable to complete its login attempts and was overloading my MySQL servers. While I was in the middle of the investigation, the problem suddenly stopped. I was happy but somewhat scared; I had no idea what the hell had happened.

A quick search through the MySQL documentation revealed the cause: MySQL performs a reverse DNS lookup on every new client connection. Since the DNS server was down, each reverse lookup was taking far more than 5 seconds to time out, and the health checks piled up and overloaded the database servers. Check the explanation in the official documentation, "How MySQL Uses DNS".

After reading that page, I concluded that MySQL needs the reverse DNS lookup only for its privilege system, so if you don't use host names in your GRANT statements you can safely disable it. Here is the relevant parameter:

--skip-name-resolve

Do not resolve host names when checking client connections. Use only IP numbers. If you use this option, all Host column values in the grant tables must be IP numbers or localhost. See Section 7.5.11, “How MySQL Uses DNS”.
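For reference, a minimal sketch of what this looks like in the server configuration file (the file location varies by distribution; /etc/my.cnf is a common default):

```ini
# /etc/my.cnf (location is distribution-dependent)
[mysqld]
# Skip the reverse DNS lookup on client connections;
# grant tables must then use IP addresses or localhost.
skip-name-resolve
```

The server must be restarted for the option to take effect.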

Could I have avoided this? Perhaps, but considering it was my first time running MySQL in production, it's unlikely.

Long live the reverse DNS, cheers!

MySQL Index Performance

When you create an index at design time, you can only guess what will happen with your application in production. Once you are live, the real hunt for the right indexes begins. Very often we choose bad indexes, ignoring basic reasoning. I've seen a lot of indexes that were slowing the application down instead of speeding it up.

Today's databases have advanced so far that the differences between indexing a numeric column and indexing a varchar column are no longer obvious. Whether you are thinking of indexing a varchar or a numeric column, first lay out the SELECTs you are going to run against that table. Not doing so is a waste of time. Next, think about the data distribution.

For example, let's assume we store 11,000,000 users in a table called my_users with the following fields: username, email, first name, last name, nickname, birthday, age, gender, country. Our application will search for users by the following fields:

  1. username
  2. first name, last name, gender
  3. email

For 1 and 3 the indexes are obvious: an index on username and an index on email. If the username is unique, its selectivity will equal that of the primary key. But what is selectivity? A simple definition: the selectivity of a column is the ratio between the number of distinct values and the total number of values. A primary key has a selectivity of 1.
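To make the ratio concrete, here is the calculation for our hypothetical my_users table (the numbers are the ones from the example above):

```python
# Selectivity = number of distinct values / total number of rows.
def selectivity(distinct_values, total_rows):
    return distinct_values / total_rows

TOTAL_USERS = 11_000_000

# A unique username: every value is distinct, same as the primary key.
print(selectivity(11_000_000, TOTAL_USERS))  # 1.0

# A column with only a handful of distinct values is nearly useless.
print(selectivity(2, TOTAL_USERS))           # ~0.00000018
```

The closer the ratio gets to 1, the fewer rows the database has to touch per lookup.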

So, coming back to case 2, what are the best indexes? Let's take the gender column first. There are only two possible values, M and F. That means a selectivity of 2/11,000,000, which is effectively 0: an awful index. If you have such an index you may as well drop it, because a full table scan can be more efficient than using it.

How about first name and last name? This is a little more complicated; it depends on which names you are storing. If you have 30,000 Johns and 50,000 Smiths, the single-column indexes are nearly useless; a simple SELECT DISTINCT on each column gives you the numbers to calculate the selectivity. If it is above 15% the index is good; otherwise drop it. The great thing is that in this case you can create a composite index on first name + last name, which is guaranteed to give you a selectivity at least as high as either column alone. Don't forget that in MySQL an index on string columns is limited to 1000 bytes, which means that with UTF-8 you can only include 333 characters.
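Here is a small self-contained sketch of the SELECT DISTINCT measurement. It uses SQLite so it can run anywhere; against MySQL you would run the same queries on your real my_users table (the column names and sample rows here are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_users (first_name TEXT, last_name TEXT)")
rows = [("John", "Smith"), ("John", "Doe"), ("Jane", "Smith"),
        ("Jane", "Doe"), ("John", "Smith"), ("Mary", "Major")]
conn.executemany("INSERT INTO my_users VALUES (?, ?)", rows)

total = conn.execute("SELECT COUNT(*) FROM my_users").fetchone()[0]
# Distinct values per column, then for the column pair.
first = conn.execute(
    "SELECT COUNT(DISTINCT first_name) FROM my_users").fetchone()[0]
combo = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT first_name, last_name FROM my_users)"
).fetchone()[0]

print(first / total)  # 3/6 = 0.5
print(combo / total)  # 5/6, so the composite index is more selective
```

The pair can never have fewer distinct values than a single column, which is why the composite index always wins on selectivity.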

You can find the worst-performing indexes on a server with the following SQL. Note that composite indexes will appear multiple times in the result, once for each column, so you need to look at the last appearance. Anything below 15% needs to be analyzed.

/*
SQL script to grab the worst performing indexes
in the whole server
*/
SELECT
t.TABLE_SCHEMA AS `db`
, t.TABLE_NAME AS `table`
, s.INDEX_NAME AS `index name`
, s.COLUMN_NAME AS `field name`
, s.SEQ_IN_INDEX `seq in index`
, s2.max_columns AS `# cols`
, s.CARDINALITY AS `card`
, t.TABLE_ROWS AS `est rows`
, ROUND(((s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) * 100), 2) AS `sel %`
FROM INFORMATION_SCHEMA.STATISTICS s
INNER JOIN INFORMATION_SCHEMA.TABLES t
ON s.TABLE_SCHEMA = t.TABLE_SCHEMA
AND s.TABLE_NAME = t.TABLE_NAME
INNER JOIN (
SELECT
TABLE_SCHEMA
, TABLE_NAME
, INDEX_NAME
, MAX(SEQ_IN_INDEX) AS max_columns
FROM INFORMATION_SCHEMA.STATISTICS
WHERE TABLE_SCHEMA != 'mysql'
GROUP BY TABLE_SCHEMA, TABLE_NAME, INDEX_NAME
) AS s2
ON s.TABLE_SCHEMA = s2.TABLE_SCHEMA
AND s.TABLE_NAME = s2.TABLE_NAME
AND s.INDEX_NAME = s2.INDEX_NAME
WHERE t.TABLE_SCHEMA != 'mysql'                         /* Filter out the mysql system DB */
AND t.TABLE_ROWS > 10                                   /* Only tables with some rows */
AND s.CARDINALITY IS NOT NULL                           /* Need at least one non-NULL value in the field */
AND (s.CARDINALITY / IFNULL(t.TABLE_ROWS, 0.01)) < 1.00 /* Selectivity < 1.0 b/c unique indexes are perfect anyway */
ORDER BY `sel %`, s.TABLE_SCHEMA, s.TABLE_NAME          /* Switch to `sel %` DESC for best non-unique indexes */

This query is taken from MySQL Forge.

Simple Way to Scale Your Web App – Part 1

Every time I built a J2EE application, my main concern was how many requests it would ultimately be able to handle. In the early days of my career I was in the situation where my app became the victim of its own success: too much load for a single server. We always had a reactive attitude and tried to deal with the problem when it happened, but sometimes it was too damn late. For your application to scale, it must be designed to scale.

But what would be a simple scalable architecture?

Simple Web Application Architecture

Let’s consider the diagram on the left. Our application is split into three logical clusters: the application cluster, the database cluster, and the logging and statistics cluster.

The Load Balancer

A load balancer distributes traffic among your application servers. It can be software or a hardware device. A handy solution is a software balancer such as Apache, but a software solution is not as robust and performant as a hardware balancer.

Hardware

A dedicated load-balancing device is more capable and gives you more performance. It comes at an additional cost, but there are many devices on the market and you should choose the best one by cost/features. When evaluating a load balancer, a few things must be kept in mind.

Software alternatives

To balance SSL connections, your balancer should provide SSL termination. Otherwise, since SSL is a connection-level protocol, the SSL session must persist between server and client, which forces the balancer to keep sending each client to the same host.

More about Load Balancing in a future post.

The Application Cluster

To be able to scale, a web application has to be designed to scale. The main issue is session state. Depending on the load-balancing algorithm, you will need either to replicate the session or to use sticky sessions.

Stateless

This mode doesn’t require much from a load balancer and is very common in REST applications. For a web 2.0 application this can be the default approach: the application stores everything it needs on the client side. The first user request can go to the first machine while the second hits a different one; no data has to be shared between web servers. To handle more requests, new servers can be added to the web pool and the system scales out.
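One common way to store state on the client side without letting users tamper with it is a signed token. The sketch below is an illustration of the idea, not a production recipe; the secret, field names, and function names are all made up for the example:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"change-me"  # hypothetical shared secret, same on every server

def encode_state(state: dict) -> str:
    """Pack session state into a tamper-evident token the client stores."""
    payload = base64.urlsafe_b64encode(json.dumps(state).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + sig

def decode_state(token: str) -> dict:
    """Verify the signature, then unpack the state."""
    payload, sig = token.rsplit(".", 1)
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("tampered token")
    return json.loads(base64.urlsafe_b64decode(payload))

token = encode_state({"user": "alice", "cart": [42]})
# Any server in the pool can verify and read it; no shared session store.
print(decode_state(token))
```

Because every server holds the same secret, any of them can validate a request, which is exactly what makes the pool stateless.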

A stateless design gives us seamless failover, and it can be achieved no matter what language we use to develop our application.

Sticky session

Sometimes our application needs to store user-specific data at the session level. This means that every time a user hits a server we need some local data and state to process the request, and that data must be available on whichever server the request lands. To cope with this we can use “sticky sessions”, which ensure that the user keeps hitting the same server as with the initial request.

The most common alternative to sticky sessions is:

Session replication

This technique is very common in J2EE, where servlet containers provide a way to replicate the session between servers. There are several conditions for a session to be replicable, but it can be a viable alternative to sticky sessions.

The Database cluster

Obviously, my immediate choice is MySQL as the database server. It’s free, has a large community, and serves a lot of well-known web 2.0 sites.

Using MySQL you can scale horizontally via MySQL’s replication mechanism. When the database grows, it’s time to partition the data horizontally into shards.

Replication

Master-Slave

This type of configuration helps you separate reads from writes. Reads always go to the slave, while transactions that alter data go to the master. You can have one master and multiple slaves.
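At the application layer, the read/write split can be as simple as routing by statement type. This is an illustrative sketch, not a real library API; the class names and stub connections are invented for the example:

```python
class ReplicatedDB:
    """Route SELECTs to the slave, everything else to the master."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def execute(self, sql, params=()):
        read_only = sql.lstrip().upper().startswith("SELECT")
        target = self.slave if read_only else self.master
        return target.execute(sql, params)

# Stub connections that just report which server received the statement.
class Stub:
    def __init__(self, name):
        self.name = name

    def execute(self, sql, params=()):
        return self.name

db = ReplicatedDB(Stub("master"), Stub("slave"))
print(db.execute("SELECT * FROM my_users"))        # slave
print(db.execute("UPDATE my_users SET age = 30"))  # master
```

In a real application the stubs would be actual MySQL connections, and you would also have to account for replication lag when reading data you just wrote.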

Master – Master

Such an approach distributes the load evenly and also provides high availability. MySQL 5.0 replicates at the statement level, which can often break replication because of conflicts. The most common conflicts I have ever met involve unique indexes, but you can cope with this problem by using the REPLACE command instead of INSERT. In any case, MySQL 5.1 will bring significant changes to the replication module.
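To see why REPLACE sidesteps the unique-key conflict, here is a small demo. It uses SQLite for self-containment, whose REPLACE behaves like MySQL's on a conflict: delete the old row, then insert the new one (the table and values are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT PRIMARY KEY, nickname TEXT)")
conn.execute("INSERT INTO users VALUES ('a@example.com', 'alice')")

# A plain INSERT of the same key raises a unique-constraint error,
# which is exactly what stalls statement-based replication...
try:
    conn.execute("INSERT INTO users VALUES ('a@example.com', 'al')")
except sqlite3.IntegrityError:
    print("duplicate key")

# ...while REPLACE overwrites the conflicting row, so a replica
# applying the same statement cannot break on the conflict.
conn.execute("REPLACE INTO users VALUES ('a@example.com', 'al')")
print(conn.execute("SELECT nickname FROM users").fetchone()[0])  # al
```

Note that REPLACE is delete-then-insert, so side effects like ON DELETE triggers and foreign keys still fire; it is a workaround, not a free lunch.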

Tree Replication

This gives you a lot of possibilities: you can combine master-master with master-slave in a tree structure, and together with sharding it can result in a finely tuned MySQL cluster. More about tree replication and data partitioning in future posts.

Logging & Batch processing

Finally, we have reached the last component of our application: the logging server and the batch-processing machine. Why do we need them? Simply because somebody has to do this work. Every application has some batch processing to be done, usually via cron and scripts. I recommend using a scripting language for batch processing, such as bash, ruby, python, etc.
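A typical cron entry for such a job might look like the following (the paths and schedule here are hypothetical placeholders):

```
# Hypothetical crontab entry: run the nightly stats job at 02:30,
# appending stdout and stderr to a dedicated log file.
30 2 * * * /opt/app/bin/nightly_stats.sh >> /var/log/nightly_stats.log 2>&1
```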

For logging I would suggest syslog, or syslog-ng, which has more advanced features than syslog, better CPU performance, and UTF-8 support.

What’s next?

Next, I would like to walk step by step through designing a J2EE application based on the architecture discussed above, using Apache, Tomcat, Struts, Hibernate and MySQL.