- Peter's InnoDB architecture and Optimization - After all these years and a great business success, Peter Zaitsev has not lost his technical attitude and I am sure he will make this tutorial extremely interesting. This is definitely a must see if you use MySQL extensively and if you need to squeeze as much as possible from a MySQL box.
- Henrik Ingo's Evaluating MySQL High Availability Alternatives - Henrik made this talk an evergreen - definitely a must see if you are a DBA coming from a different database and you want to explore HA alternatives with MySQL.
- Sergei and Timour's Advanced query optimizer tuning and analysis - Presented by two of the original developers at MySQL AB. If you understand how the optimizer will treat your queries, you can probably increase your application performance 10x.
- The three keynotes on Tuesday - They are a great start for the Conference. I am glad we will see Tomas on stage at Percona Live!
- MariaDB Cassandra Interoperability from Colin and Sergei - This is what I think is one of the most innovative topics at the Conference. I believe that the integration between NoSQL and MySQL needs more refinements, but this is really bleeding edge.
- Seppo Jaakola's Galera Cluster Best Practices - Directly from the creators of Galera, again, another great innovation in the area of HA for MySQL, probably the most important after MySQL Replication (I am not considering Cluster/NDB in this scenario).
- Deploying a HA solution in EC using MariaDB from Massimiliano Pinto and Mark Riddoch - I have to admit, Mark and Massimiliano work in my team and I know what they are up to. They will present software modules and best practices that will be helpful for everybody who wants to use EC2 as database infrastructure. So, go and listen to what RDS can do for you, and compare it with this.
- Colin's and Monty's MariaDB 10 and what's new - Once you know what's new in MySQL 5.6, it is definitely a good idea to compare it with MariaDB 10 and draw your own conclusions on which one fits better in your infrastructure.
- Nagavamsi's Online shard migration at Facebook - Ideal for large MySQL installations: it is good to know a bit more about the tips & tricks that may come from the MySQL team at Facebook.
- Robert Hodges's State of the art of MySQL Multi-master replication - Again, if you are interested in the topic, Robert's talk is definitely a must see, presented by one of the most competent people in this area.
- Using MySQL Performance Schema from Zburivsky Danil - The Performance Schema is a great addition to MySQL, and this is another good talk for developers and DBAs.
FOSDEM 13 is now over. I am on my way home and I would like to share some thoughts sparked by the intense atmosphere that I experienced during these two days in Brussels.
On 29 November last year, SkySQL and Monty Program jointly announced the release of the so-called "MariaDB Client Library for C Applications" and "MariaDB Client Library for Java Applications", which I will call the C and JDBC connectors here. You can follow this link to read the press release from SkySQL.
Last week Baron Schwartz posted some findings about the C connector in his blog. Today, another post from Robert Hodges added a few details and asked for answers. I think I can help a bit in understanding what we have released and how the connectors can be used. I will not comment on the accusations of plagiarism mentioned in Baron's post, for two reasons. First of all, I think plagiarism is a serious accusation that refers to illicit actions (but I am not a native English speaker and I may be wrong) and I do not have any legal expertise. Second, I think that any discussion around the ethical or unethical use of somebody else's code, especially when it is based on the number or the percentage of lines changed, is pretty slippery ground and far too prone to individual interpretation.
The C connector in [very] short terms
You can find information regarding the C connector here and here. The connector is based on the MySQL Client Library v3.23.58. If you look at the source code of both libraries you can immediately spot that there are significant differences. Take the libmysql.c file, for example: you will find significant changes within the C functions and some new functions, such as cli_report_progress, changes to support the 5.X protocol, connection/close/reconnect etc. Obviously, libmysql.c is not the only module to show changes. Take the prepared statements in my_stmt.c and my_stmt_codec.c, for example, where Monty's team worked on the porting from the mysqlnd extensions.
A bit more history and extra info for the JDBC connector
I can certainly add more history and info for the JDBC connector. Mark Riddoch (our head of Engineering) and his team have actively participated in the improvement of the connector, together with Monty Program's team. What you can read below is a summary of what happened and the areas that they have improved.
The JDBC connector is directly derived from the Drizzle connector. When we looked for a Java connector that could work as an alternative to the MySQL Connector/J from Oracle, we were very happy with the great job that Markus Eriksson did. The only problem was that the Drizzle connector, although it conformed to the basic JDBC interface, had been stripped down to the bare essentials. This is good for the objectives of the Drizzle project, but we wanted to offer something that was more generally useful for the MySQL and MariaDB™ community at large. The problem was, of all the vast array of missing features, which should we tackle first? We embarked on a lengthy evaluation program to determine which features to add ourselves and which to fund Monty Program to add to the driver.
Our first approach was to take a few popular Java applications and test these against the connector. One of the issues with this of course is to identify applications that are a reasonable representation of what an end user might use. Also we have to test these applications to fully investigate the features that the application uses with the driver. Given the Java linkage model this means we need to effectively do as complete a test as possible for each application in order to fully exercise the JDBC API.
Using this approach we quickly found some of the more major areas of incompatibility, namely that the original Drizzle connector used an incompatible syntax for the connection URL and was missing compression. Our aim was to provide a drop-in compatible JDBC driver, therefore having a compatible syntax for the URL was an obvious first fix to make, as was the addition of compression.
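To make the URL point concrete, here is a minimal sketch of how an application might assemble a Connector/J-style connection URL with options. The host name, database name and option values are hypothetical, and the exact option spellings accepted by the driver are an assumption here (the names are taken from the list of improvements in this post); check the driver documentation before relying on them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Illustrative helper only: builds a Connector/J-style JDBC URL string. */
class JdbcUrlSketch {

    /**
     * Joins host, port, database and options as
     * jdbc:mysql://host:port/database?key=value&key=value.
     * Values are appended as-is; no URL encoding is attempted in this sketch.
     */
    static String build(String host, int port, String database, Map<String, String> options) {
        StringBuilder url = new StringBuilder("jdbc:mysql://")
                .append(host).append(':').append(port).append('/').append(database);
        if (!options.isEmpty()) {
            StringJoiner query = new StringJoiner("&", "?", "");
            options.forEach((key, value) -> query.add(key + "=" + value));
            url.append(query);
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical option values; the option names are assumed spellings.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("useCompression", "true");
        options.put("socketTimeout", "5000");
        System.out.println(build("db.example.com", 3306, "test", options));
    }
}
```

The resulting string is what an application would typically hand to java.sql.DriverManager.getConnection, which is why a URL syntax that differs from Connector/J breaks the drop-in promise.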
Although this approach seemed reasonable at first sight, setting up various applications, creating the data sets needed by each application and then manually testing it was a time-consuming operation. As an enhancement to this approach we decided to look around for an application or a framework that had some form of automated test procedures that we could use to test the interface to the database. We chose the Hibernate framework as our vehicle for this since it contains a set of automated tests to verify the SQL dialect it is using. This enabled us to automatically run a large set of tests across a diverse set of API methods of the JDBC driver. Crucially, this enabled us to run regression tests very easily each time we fixed a bug or added new functionality to the JDBC driver. The first runs of the Hibernate test suite yielded several hundred failures and very quickly led us to a number of fixes that we needed to apply to the driver and that had not been picked up by the tests we had previously run. The lack of support for BLOB and CLOB datatypes caused a vast number of failures, so these were an obvious first target for our development of the driver. This was quickly followed by a number of other features such as better support for the manipulation of the metadata and schema aspects of the JDBC API.
The lack of support for streaming data sets, and the way large result sets were handled in the Drizzle driver, required a number of changes to the driver. Callable statement support and stored procedure support also needed to be added to support these features of MySQL.
The third phase of the development and testing of the connector was to engage with some external application developers who required a JDBC driver with more relaxed licence requirements. This added a whole new set of issues that had not come to light with the Hibernate test suite. A number of these issues were related to concurrent usage, which had not been highlighted by the Hibernate test suite: synchronisation issues that had not been solved, connection handling for high numbers of concurrent connections, connection leaks and general resource leaks were all highlighted by this test route. Some of the more interesting issues that this approach raised were related to the MySQL specifics that the more general Hibernate approach didn’t find; these included the various parameters that could be set as connection properties and reliance on particular semantics of the MySQL Connector/J.
Datatype support also became an issue as we tested with more and more application developers; temporal datatypes needed to be updated to support microseconds, and numerous cases were found where the Drizzle driver had a different mapping between the database types and the native Java data types. In particular the getObject API call required some work to give true compatibility with the MySQL Connector/J.
One particularly interesting case was the getTables() and getColumns() API, used to get table information from the database schema. The implementation of getTables() in the Drizzle driver was 100% compatible with the specification of this routine in the JDBC specification; however, the MySQL Connector/J was not correct according to the JDBC specification. A particular application we were working with depended upon this particular behaviour in MySQL Connector/J, which was technically incorrect, so this gave us an issue: we had a driver that was correct but different from the one we wanted to emulate. So should we deliberately put what could be considered a bug into our JDBC driver? The solution we arrived at was to add a connection parameter to switch between the strict JDBC behaviour for getTables() and the MySQL Connector/J behaviour; so we added the NoSchemaPattern connection option for this switch.
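The effect of such a switch can be pictured with a small toy model. This is not the driver's code: the Table class, the in-memory catalogue and the literal (non-wildcard) matching are all hypothetical simplifications, used only to show how one flag can select between filtering on the schema pattern and ignoring it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a NoSchemaPattern-style switch; not taken from the driver. */
class SchemaPatternSketch {

    /** Hypothetical holder for a table and the schema it belongs to. */
    static final class Table {
        final String schema;
        final String name;

        Table(String schema, String name) {
            this.schema = schema;
            this.name = name;
        }
    }

    /**
     * Returns the table names selected by a getTables-like lookup.
     * Strict mode filters on schemaPattern (matched literally here, no wildcards);
     * when noSchemaPattern is true the schemaPattern argument is ignored.
     */
    static List<String> getTables(List<Table> catalogue, String schemaPattern, boolean noSchemaPattern) {
        List<String> names = new ArrayList<>();
        for (Table table : catalogue) {
            boolean schemaMatches = noSchemaPattern
                    || schemaPattern == null
                    || schemaPattern.equals(table.schema);
            if (schemaMatches) {
                names.add(table.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Table> catalogue = Arrays.asList(new Table("sales", "orders"), new Table("hr", "staff"));
        System.out.println(getTables(catalogue, "sales", false)); // strict: filtered by schema
        System.out.println(getTables(catalogue, "sales", true));  // compatibility: argument ignored
    }
}
```

An application written against the compatibility behaviour would see different results in strict mode, which is why the choice had to be left to a connection option rather than hard-coded.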
At this point we believed we had something that was fairly usable, and started to widen the availability of the JDBC driver to more application developers in an attempt to ascertain if the driver was complete enough to be interesting to the general user base. There were still major areas of functionality that we knew to be missing, in particular the support for connection pooling and connection failover. In parallel to the widening of the testing we also added these features and support for introspection.
The improvements
In all we spent just over a year taking the Drizzle core and slowly testing and enhancing it before handing the driver over to Monty Program for release to the community.
Here is a summary of the improvements that we made in the MariaDB connectors since the initial fork, in chronological order:
- UseCompression option
- Connection string compatibility with MySQL Connector/J
- Implemented getWarnings/clearWarnings
- Added support for CLOB
- Added callable statements
- Allowed IP addresses in connection URLs
- Added large result set support
- Fixed synchronisation issue when multiple connections are used by threads in a single application
- Added an option to control the output of tracing messages from the MySQLProtocol class, using a connection option MySQLProtocolLogLevel=FINEST
- Fixed prepared statements so that the parameters are not automatically cleared between each invocation of the prepared statement
- Added support for the MaxRows API
- Added connect URL options: socketTimeout, interactiveClient, localSocketAddress, createDatabaseIfNotExist
- Fix for IGNORE_BACKSLASH_ESCAPES
- Implementation of the IGNORE_BACKSLASH_ESCAPES option in line with the MySQL Connector/J treatment
- Update error handling for zero dates to correctly set SQLState
- Corrected object type returned by getObject for INTEGER columns
- Bring other object types returned by getObject into line with the MySQL Connector/J
- Added support for setCatalog and getCatalog. These follow the model of the MySQL driver and treat databases as catalogues
- Resolved quoting issue with setCatalog
- Added the NoSchemaPattern option to the connection string to enable compatibility with MySQL Connector/J. The schema pattern argument to getColumns and getTables is ignored, as in the MySQL Connector/J, if NoSchemaPattern is set to true.
- Update the remainder of the meta data functions to honour the NoSchemaPattern connection option
- Fix issue with naming of auto increment columns
- Clears the cached result set on closing a statement
- Removes the prepared statement cache
- Make the methods getDatabaseMajorVersion and getDatabaseMinorVersion work.
- Support for the javax.sql.PooledConnection interface
- A fix for the pooled connection to tidy up transactions in the event of close() being called
- Extra methods on the MySQLDataSource class to allow the data source to be created via reflection:
  - zero argument constructor
  - public void setDatabaseName(String dbName)
  - public String getDatabaseName()
  - public void setUserName(String userName)
  - public String getUserName()
  - public void setPassword(String pass)
  - public void setPort(int p)
  - public int getPort()
  - public void setPortNumber(int p)
  - public int getPortNumber()
  - public void setServerName(String serverName)
  - public String getServerName()
- Addition of the setURL and setUrl methods on MySQLDataSource to aid creation of data sources using reflection
- Addition of setUser and getUser as well as the less used setUserName and getUserName methods
- Update to setURL to avoid overwrite of items set with individual methods if the corresponding value is not set in the URL itself
- Corrects a problem with final attributes in the JDBCUrl class preventing modification of connection parameters
- Addition of connection properties:
  - socketFactory
  - tcpKeepAlive
  - tcpNoDelay
  - tcpRcvBuf
  - tcpSndBuf
- Added in the ability to return generated keys in executeUpdate methods of a statement
- Addition of the dumpQueryOnException connection property
- Further fixes for the generated key support
- Corrected behaviour of getObject on a result set with a TINYINT(1) column. If the TINYINT1_IS_BIT property was set, this previously always returned false.
- Fixed issue that prevented '-' character in hostnames in a URL
- Fixed issue related to BufferUnderflow exceptions
- Fixed issue with the mapping of CHAR columns into a Java type
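Several items above mention creating the data source "via reflection". As a rough sketch of why a zero argument constructor and bean-style setters matter, the following shows how a container could instantiate and configure a data source knowing only its class name. DemoDataSource is a hypothetical stand-in written for this example, not the real MySQLDataSource; only the setter names mirror those listed above.

```java
import java.lang.reflect.Method;

/** Sketch of container-style, reflection-based data source configuration. */
class ReflectiveDataSourceSketch {

    /** Hypothetical stand-in bean; NOT the real MySQLDataSource class. */
    public static class DemoDataSource {
        private String serverName;
        private int portNumber;
        private String databaseName;

        public DemoDataSource() { } // zero argument constructor, needed for newInstance()

        public void setServerName(String serverName) { this.serverName = serverName; }
        public String getServerName() { return serverName; }
        public void setPortNumber(int p) { this.portNumber = p; }
        public int getPortNumber() { return portNumber; }
        public void setDatabaseName(String dbName) { this.databaseName = dbName; }
        public String getDatabaseName() { return databaseName; }
    }

    /** Creates the named class and applies three setters looked up by name. */
    static Object configure(String className, String server, int port, String database) {
        try {
            Object bean = Class.forName(className).getDeclaredConstructor().newInstance();
            invokeSetter(bean, "setServerName", String.class, server);
            invokeSetter(bean, "setPortNumber", int.class, port);
            invokeSetter(bean, "setDatabaseName", String.class, database);
            return bean;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not configure " + className, e);
        }
    }

    private static void invokeSetter(Object bean, String name, Class<?> type, Object value)
            throws ReflectiveOperationException {
        Method setter = bean.getClass().getMethod(name, type);
        setter.invoke(bean, value);
    }

    public static void main(String[] args) {
        // Host, port and database name are illustrative values.
        DemoDataSource ds = (DemoDataSource) configure(
                DemoDataSource.class.getName(), "db.example.com", 3306, "test");
        System.out.println(ds.getServerName() + ":" + ds.getPortNumber() + "/" + ds.getDatabaseName());
    }
}
```

Without the public no-argument constructor the newInstance() call fails, and without the setters the container has no way to pass the server name or port, which is presumably why both appear in the list.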
Naturally, we want to improve both connectors. We are also looking for contributors who can help with testing, provide feedback and ideas, or even more if possible. The fact that Baron, Robert and others are blogging about the connectors is really encouraging. Suggestions and comments are more than welcome!
What is the value of a seed?
When I discussed the MariaDB Foundation with friends and colleagues, many said I was exaggerating. My belief is that the Foundation is important for MySQL and for the future of the IT industry - services and applications. Many agreed with me that the Foundation is important for the MySQL ecosystem, but thought that involving the global economy and the whole IT industry is a bit of a stretch.
The fact is, the World Wide Web would not be as it is today without MySQL. MySQL was part of the LAMP stack. It powered - and still powers - some of the most successful web sites in the world. Without LAMP, companies like Facebook, Twitter or Google would have been developed in a completely different way or they would not have been developed at all.
The importance of MySQL for the Internet is pretty clear for many of us. What is perhaps less clear is the current situation of MySQL and the MySQL ecosystem.
If we look at the main development of MySQL today, we can say that so far Oracle has played a good and positive role in improving it. MySQL 5.5 has been a significant milestone in the development of the database, especially because development had been stagnating for a relatively long time. MySQL 5.6 has addressed some of the issues left open in 5.5 and it has improved scalability and reliability. The point is, most of these improvements happened within one of the most important pieces of MySQL - the InnoDB storage engine. Oracle made several changes outside InnoDB, but some of them were really a catch-up with MariaDB, and the strongest skills at Oracle are clearly in the transactional storage engine. At the same time, the team at MariaDB has worked on other pieces of the jigsaw. The optimizer is clearly one of them; other specialised storage engines, plugins and Replication are others. In addition to Oracle/MySQL and MariaDB, Percona did a great job in applying specific patches to the server and in providing essential tools that make MySQL "a better MySQL".
Why can't Oracle improve in other areas?
So, the obvious question is: why can't Oracle improve MySQL in other areas? Of course they have the expertise to work on a better database and they can easily attract talented developers.
I think the answer relies on two factors. First of all, the InnoDB team appears to be the strongest team at Oracle/MySQL(*). In the last two months the team has also strengthened its expertise with the addition of new developers. The second factor is purely economic. Despite the enormous resources that Oracle can put together, everything must end with a cost/benefit analysis and with a positive number at the bottom. Oracle has a wide variety of great and profitable solutions in their portfolio: they simply act as anybody else would act in their position, i.e. they focus on the most profitable and promising technologies that can guarantee a great return on investment. Unfortunately, in the long term MySQL is not one of them.
A completely different scenario
The scenario today is completely different from three or four years ago. When Oracle acquired Sun, there was very little competition between MySQL and the Oracle database. Yes, there were some examples of migrations or new applications being developed on MySQL, but the distinction between Enterprise and the Web applications, and consequently between the use of the technologies related, was pretty clear.
Things have changed dramatically though. The future - and most of the present - is in the Cloud and in Cloud services. In the next 5 years, SaaS technologies will expand more than any other software solution. Which database technologies are these SaaS providers going to use? But, above all, from which applications and traditional software solutions will SaaS providers grab customers? A detailed answer to this last question deserves another post.
An answer to the first question - which database technologies are SaaS providers going to use? - is a key element to understand why the Foundation is so important.
It is difficult to identify which technology will be mostly used, but it is clear that there will be a mix of NoSQL and SQL technologies.
SQL, NoSQL, NewSQL
NoSQL databases, believe it or not, are here to stay. Developers love them, and they solve lots of problems that software engineers had to fight for years. As usual, they fixed some issues and introduced some others, which is pretty natural.
Software engineers looked at NoSQL technologies because they could not find solutions to their problems in standard SQL databases. On the other hand, SQL databases cannot be completely replaced by NoSQL solutions. They need to work together. Eventual consistency and non-ACID databases work for some data, but not for everything.
There are many relational databases on the market, but when the selection is narrowed down to robust open source software, the choice can be made only between two: Postgres and MySQL. Postgres is enjoying great momentum, but it is far from the adoption and the ubiquity of MySQL, not to mention that the agility of MySQL makes the database a perfect component for the Cloud, whilst Postgres seems to be still marketed as the open source alternative for Enterprise software.
Putting all together
In the end, why is the Foundation so important? In a single sentence, because it is backed by the key players in the MySQL ecosystem and it acts with one clear interest only: to make MySQL a better MySQL, regardless of the label applied to the software - MySQL, MariaDB or something else.
MariaDB is evolving greatly, also thanks to the great improvements made possible by the InnoDB team at Oracle. On top of that, the MariaDB team has worked and is working on that important link that is the integration of SQL, NoSQL and NewSQL, on premises and in the Cloud. This is not only related to the creation of new storage engines, but also to the availability of other plugins and components that will make SQL and NoSQL a single, complete database solution.
I believe this great work will give a boost to the adoption of MySQL/MariaDB in the Cloud, for host providers, for public, private and hybrid clouds, and for SaaS providers. And again, there will be space for new solutions, made by bright entrepreneurs with great ideas and little money. And hopefully these new services will better serve new businesses at very low cost. We will see, but in the meantime, let's not wait, let's contribute!
(*) I do not refer to MySQL Cluster here
Labels: MariaDB, MariaDB Foundation, MySQL, MySQL in the Cloud, SkySQL
New great features in 5.6 Release Candidate and in MySQL Cluster 7.3
What about MariaDB 10?
And where is the Cloud?
Look! You can see the heart beating!
What is the SkySQL Cloud Data Suite?
How can you help and where can you find help?
What's next?
Or, better say, I have not blogged for so long…
Labels: MySQL, Reference Architecture, SkySQL


