This release incorporates new features in the NDBCLUSTER storage engine and fixes recently discovered bugs in MySQL Cluster NDB 7.0.9a.
Obtaining MySQL Cluster NDB 7.0.10. The latest MySQL Cluster NDB 7.0 binaries for supported platforms can be obtained from http://dev.mysql.com/downloads/select.php?id=14. Source code for the latest MySQL Cluster NDB 7.0 release can be obtained from the same location. You can also access the MySQL Cluster NDB 7.0 development source tree at https://code.launchpad.net/~mysql/mysql-server/mysql-cluster-7.0.
This release also incorporates all bugfixes and changes made in previous MySQL Cluster NDB 6.1, 6.2, 6.3, and 7.0 releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.1 through MySQL 5.1.39 (see Section C.1.10, “Changes in MySQL 5.1.39 (04 September 2009)”).
Please refer to our bug database at http://bugs.mysql.com/ for more details about the individual bugs fixed in this version.
Functionality added or changed:
Added the ndb_mgmd --nowait-nodes option, which allows a cluster that is configured to use multiple management servers to be started with fewer than the configured number of management servers. This is most likely to be useful when a cluster is configured with two management servers and you wish to start the cluster using only one of them.
See Section 17.4.4, “ndb_mgmd — The MySQL Cluster Management Server Daemon”, for more information. (Bug#48669)
This enhanced functionality is supported for upgrades from MySQL
Cluster NDB 6.3 when the NDB
engine
version is 6.3.29 or later.
(Bug#48528, Bug#49163)
The output from ndb_config
--configinfo
--xml
now indicates, for each
configuration parameter, the following restart type information:
Whether a system restart or a node restart is required when resetting that parameter;
Whether cluster nodes need to be restarted using the
--initial
option when resetting the
parameter.
Bugs fixed:
Node takeover during a system restart occurs when the REDO log for one or more data nodes is out of date, so that a node restart is invoked for that node or those nodes. If this happens while a mysqld process is attached to the cluster as an SQL node, the mysqld takes a global schema lock (a row lock), while trying to set up cluster-internal replication.
However, this setup process could fail, causing the global schema lock to be held for an excessive length of time, which made the node restart hang as well. As a result, the mysqld failed to set up cluster-internal replication, which led to tables being read-only, and caused one node to hang during the restart.
This issue could actually occur only in MySQL Cluster NDB 7.0, but the fix was also applied to MySQL Cluster NDB 6.3, in order to keep the two codebases in alignment.
Sending SIGHUP to a mysqld running with the --ndbcluster and --log-bin options caused the process to crash instead of refreshing its log files. (Bug#49515)
If the master data node receiving a request from a newly started API or data node for a node ID died before the request had been handled, the management server waited (and kept a mutex) until all handling of this node failure was complete before responding to any other connections, instead of responding to other connections as soon as it was informed of the node failure (that is, it waited until it had received a NF_COMPLETEREP signal rather than a NODE_FAILREP signal). One visible effect of this misbehavior was that it caused management client commands such as SHOW and ALL STATUS to respond with unnecessary slowness in such circumstances. (Bug#49207)
Attempting to create more than 11435 tables failed with Error 306 (Out of fragment records in DIH). (Bug#49156)
When evaluating the options --include-databases, --include-tables, --exclude-databases, and --exclude-tables, the ndb_restore program overwrote the result of the database-level options with the result of the table-level options rather than merging these results together, sometimes leading to unexpected and unpredictable results.
As part of the fix for this problem, the semantics of these options have been clarified; because of this, the rules governing their evaluation have changed slightly. These changes can be summed up as follows (a sketch of the resulting rule follows this item):
All --include-* and --exclude-* options are now evaluated from right to left in the order in which they are passed to ndb_restore.
All --include-* and --exclude-* options are now cumulative.
In the event of a conflict, the first (rightmost) option takes precedence.
For more detailed information and examples, see Section 17.4.17, “ndb_restore — Restore a MySQL Cluster Backup”. (Bug#48907)
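The resulting rule can be illustrated with a short sketch. The following C++ program is only a model of the documented right-to-left, rightmost-wins evaluation; it is not code from ndb_restore, and the FilterOption structure, the isRestored() helper, and the default applied when no option matches a table are assumptions made for this example.

    #include <iostream>
    #include <string>
    #include <vector>

    // One --include-* or --exclude-* option, already split into its kind and
    // the database (and, for the table-level options, the table) it names.
    struct FilterOption {
        bool include;       // true for --include-*, false for --exclude-*
        std::string db;     // database the option names
        std::string table;  // empty for --include-databases / --exclude-databases
    };

    // Decide whether db.table is restored, given the options in the order in
    // which they appeared on the command line.
    bool isRestored(const std::vector<FilterOption> &opts,
                    const std::string &db, const std::string &table)
    {
        bool anyInclude = false;
        for (const FilterOption &o : opts)
            if (o.include)
                anyInclude = true;

        // Evaluate right to left: the first (rightmost) matching option wins.
        for (auto it = opts.rbegin(); it != opts.rend(); ++it) {
            bool matches = (it->db == db) &&
                           (it->table.empty() || it->table == table);
            if (matches)
                return it->include;
        }

        // Nothing matched: assume everything is restored unless at least one
        // --include-* option was given (an assumption made for this sketch).
        return !anyInclude;
    }

    int main()
    {
        // Roughly models: --exclude-databases=db1 --include-tables=db1.t1
        std::vector<FilterOption> opts = {
            {false, "db1", ""},    // --exclude-databases=db1
            {true,  "db1", "t1"},  // --include-tables=db1.t1 (rightmost)
        };
        std::cout << isRestored(opts, "db1", "t1") << "\n";  // 1: t1 is restored
        std::cout << isRestored(opts, "db1", "t2") << "\n";  // 0: t2 is excluded
        return 0;
    }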
When performing tasks that generated large amounts of I/O (such as when using ndb_restore), an internal memory buffer could overflow, causing data nodes to fail with signal 6.
Subsequent analysis showed that this buffer was not actually required, so this fix removes it. (Bug#48861)
Exhaustion of send buffer memory or long signal memory caused data nodes to crash. Now an appropriate error message is provided instead when this situation occurs. (Bug#48852)
In some situations, when it was not possible for an SQL node to start a schema transaction (necessary, for instance, as part of an online ALTER TABLE), NDBCLUSTER did not correctly indicate the error to the MySQL server, which led mysqld to crash. (Bug#48841)
Under certain conditions, accounting of the number of free scan records in the local query handler could be incorrect, so that during node recovery or local checkpoint operations, the LQH could find itself lacking a scan record that it expected to find, causing the node to crash. (Bug#48697)
See also Bug#48564.
The creation of an ordered index on a table undergoing DDL operations could cause a data node crash under certain timing-dependent conditions. (Bug#48604)
During an LCP master takeover, when the newly elected master did
not receive a COPY_GCI
LCP protocol message
but other nodes participating in the local checkpoint had
received one, the new master could use an uninitialized
variable, which caused it to crash.
(Bug#48584)
When running many parallel scans, a local checkpoint (which performs a scan internally) could find itself unable to obtain a scan record, which led to a data node crash. Now an extra scan record is reserved for this purpose, and failure to obtain a scan record returns an appropriate error (error code 489, Too many active scans). (Bug#48564)
During a node restart, logging was enabled on a per-fragment
basis as the copying of each fragment was completed but local
checkpoints were not enabled until all fragments were copied,
making it possible to run out of redo log file space
(NDB
error code 410) before the
restart was complete. Now logging is enabled only after all
fragments have been copied, just prior to enabling local
checkpoints.
(Bug#48474)
When using very large transactions containing many inserts,
ndbmtd could fail with Signal
11 without an easily detectable reason, due to an
internal variable being uninitialized in the event that the
LongMessageBuffer
was overloaded. Now, the
variable is initialized in such cases, avoiding the crash, and
an appropriate error message is generated.
(Bug#48441)
See also Bug#46914.
A data node crashing while restarting, followed by a system restart, could lead to incorrect handling of redo log metadata, causing the system restart to fail with Error while reading REDO log. (Bug#48436)
Starting a mysqld process with --ndb-nodeid (either as a command-line option or by assigning it a value in my.cnf) caused the mysqld to obtain only the connection corresponding to the [mysqld] section of the config.ini file having the matching ID, even when connection pooling was enabled (that is, when the mysqld process was started with --ndb-cluster-connection-pool set greater than 1). (Bug#48405)
The configuration check that each management server runs to verify that all connected ndb_mgmd processes have the same configuration could fail when a configuration change took place while this check was in progress. Now in such cases, the configuration check is rescheduled for a later time, after the change is complete. (Bug#48143)
When employing NDB
native backup to
back up and restore an empty NDB
table that used a non-sequential
AUTO_INCREMENT
value, the
AUTO_INCREMENT
value was not restored
correctly.
(Bug#48005)
ndb_config --xml --configinfo now indicates that parameters belonging in the [SCI], [SCI DEFAULT], [SHM], and [SHM DEFAULT] sections of the config.ini file are deprecated or experimental, as appropriate. (Bug#47365)
NDB stores blob column data in a separate, hidden table that is not accessible from MySQL. If this table was missing for some reason (such as accidental deletion of the file corresponding to the hidden table) when making a MySQL Cluster native backup, ndb_restore crashed when attempting to restore the backup. Now in such cases, ndb_restore fails with the error message Table table_name has blob column (column_name) with missing parts table in backup instead. (Bug#47289)
In MySQL Cluster NDB 7.0, ndb_config and ndb_error_reporter were printing warnings about management and data nodes running on the same host to stdout instead of stderr, as was the case in earlier MySQL Cluster release series. (Bug#44689, Bug#49160)
See also Bug#25941.
DROP DATABASE failed when there were stale temporary NDB tables in the database. This situation could occur if mysqld crashed during execution of a DROP TABLE statement after the table definition had been removed from NDBCLUSTER but before the corresponding .ndb file had been removed from the crashed SQL node's data directory. Now, when mysqld executes DROP DATABASE, it checks for these files and removes them if there are no corresponding table definitions for them found in NDBCLUSTER. (Bug#44529)
Creating an NDB
table with an
excessive number of large BIT
columns caused the cluster to fail. Now, an attempt to create
such a table is rejected with error 791 (Too many
total bits in bitfields).
(Bug#42046)
See also Bug#42047.
When a long-running transaction lasting long enough to cause
Error 410 (REDO log files overloaded) was
later committed or rolled back, it could happen that
NDBCLUSTER
was not able to release
the space used for the REDO log, so that the error condition
persisted indefinitely.
The most likely cause of such transactions is a bug in the application using MySQL Cluster. This fix should handle most cases where this might occur. (Bug#36500)
Deprecation and usage information obtained from
ndb_config --configinfo
regarding the PortNumber
and
ServerPort
configuration parameters was
improved.
(Bug#24584)
Disk Data: When running a write-intensive workload with a very large disk page buffer cache, CPU usage approached 100% during a local checkpoint of a cluster containing Disk Data tables. (Bug#49532)
Disk Data: NDBCLUSTER failed to provide a valid error message when it failed to commit schema transactions during an initial start if the cluster was configured using the InitialLogFileGroup parameter. (Bug#48517)
Disk Data: In certain limited cases, it was possible when the cluster contained Disk Data tables for ndbmtd to crash during a system restart. (Bug#48498)
See also Bug#47832.
Disk Data: Repeatedly creating and then dropping Disk Data tables could eventually lead to data node failures. (Bug#45794, Bug#48910)
Disk Data: When a crash occurs due to a problem in Disk Data code, the currently active page list is printed to stdout (that is, in one or more ndb_nodeid_out.log files). One of these lists could contain an endless loop; this caused a printout that was effectively never-ending. Now in such cases, a maximum of 512 entries is printed from each list. (Bug#42431)
Disk Data:
When the FileSystemPathUndoFiles
configuration parameter was set to a nonexistent path, the
data nodes shut down with the generic error code 2341
(Internal program error). Now in such
cases, the error reported is error 2815 (File not
found).
Cluster Replication:
When expire_logs_days
was set,
the thread performing the purge of the log files could deadlock,
causing all binary log operations to stop.
(Bug#49536)
Cluster API: When a DML operation failed due to a uniqueness violation on an NDB table having more than one unique index, it was difficult to determine which constraint caused the failure; it was necessary to obtain an NdbError object, then decode its details property, which in turn could lead to memory management issues in application code.
To help solve this problem, a new API method Ndb::getNdbErrorDetail() is added, providing a well-formatted string containing more precise information about the index that caused the unique constraint violation. The following additional changes are also made in the NDB API:
Use of NdbError.details is now deprecated in favor of the new method.
The NdbDictionary::listObjects() method has been modified to provide more information.
For more information, see Ndb::getNdbErrorDetail(), The NdbError Structure, and Dictionary::listObjects(). (Bug#48851)
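As an illustration, a caller might use the new method as in the following sketch. The getNdbErrorDetail() signature shown here (an NdbError reference plus a caller-supplied buffer and its length) and the use of NDB error code 893 for unique constraint violations are assumptions based on the NDB API documentation for this release; check the NDB API headers for the exact declaration.

    #include <NdbApi.hpp>
    #include <cstdio>

    // After a transaction fails with NDB error 893 (constraint violation),
    // ask the Ndb object for the name of the index that was violated. The
    // buffer and its length are supplied by the caller.
    void reportUniqueViolation(Ndb *ndb, NdbTransaction *trans)
    {
        const NdbError &err = trans->getNdbError();
        if (err.code == 893)  // unique constraint violation (assumed code)
        {
            char buff[512];
            if (ndb->getNdbErrorDetail(err, buff, sizeof(buff)) != NULL)
                fprintf(stderr, "violated index: %s\n", buff);
            else
                fprintf(stderr, "no detail available for error %d\n", err.code);
        }
    }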
Cluster API: When using blobs, calling getBlobHandle() requires the full key to have been set using equal(), because getBlobHandle() must access the key for adding blob table operations. However, if getBlobHandle() was called without first setting all parts of the primary key, the application using it crashed. Now, an appropriate error code is returned instead. (Bug#28116, Bug#48973)
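A minimal sketch of the requirement follows; the table object, the key columns pk1 and pk2, and the blob column data are hypothetical names, and error handling is reduced to the bare minimum.

    #include <NdbApi.hpp>

    // Read a blob from a table with a two-part primary key. The key columns
    // "pk1" and "pk2" and the blob column "data" are hypothetical.
    int readBlob(Ndb *ndb, const NdbDictionary::Table *tab,
                 Uint32 pk1, Uint32 pk2)
    {
        NdbTransaction *trans = ndb->startTransaction();
        if (trans == NULL)
            return -1;

        NdbOperation *op = trans->getNdbOperation(tab);
        if (op == NULL || op->readTuple() != 0) {
            ndb->closeTransaction(trans);
            return -1;
        }

        // Every part of the primary key must be set with equal() before
        // getBlobHandle() is called; with this fix, omitting a key part makes
        // getBlobHandle() fail with an error rather than crashing the caller.
        op->equal("pk1", pk1);
        op->equal("pk2", pk2);

        NdbBlob *blob = op->getBlobHandle("data");
        if (blob == NULL) {
            ndb->closeTransaction(trans);
            return -1;
        }

        // ... register a read of the blob value, then execute the transaction ...
        int ret = trans->execute(NdbTransaction::Commit);
        ndb->closeTransaction(trans);
        return ret;
    }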
The mysql_real_connect()
C API
function only attempted to connect to the first IP address
returned for a hostname. This could be a problem if a hostname
mapped to multiple IP addresses and the server was not bound to
the first one returned. Now
mysql_real_connect()
attempts to
connect to all IPv4 or IPv6 addresses that a domain name maps
to.
(Bug#45017)
See also Bug#47757.
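No application changes are needed to benefit from this fix. A typical connection call such as the one sketched below (the host name and credentials are placeholders) now simply tries every address to which the host name resolves.

    #include <mysql.h>
    #include <cstdio>

    int main()
    {
        MYSQL *conn = mysql_init(NULL);
        if (conn == NULL)
            return 1;

        // With this fix, if "db.example.com" resolves to several addresses,
        // the client library tries each of them until one accepts the connection.
        if (mysql_real_connect(conn, "db.example.com", "user", "password",
                               "test", 3306, NULL, 0) == NULL)
        {
            fprintf(stderr, "connect failed: %s\n", mysql_error(conn));
            mysql_close(conn);
            return 1;
        }

        // ... issue queries ...
        mysql_close(conn);
        return 0;
    }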