2014/05/06

Business models

Something common to every business model: users pay money when some barrier stands in their way. They pay in order to remove that barrier and break out of a situation where "something cannot be done".
With OSS it is hard to earn revenue through simple license sales the way proprietary software does. But even if license sales are impossible, as long as users exist, a business exists. IT systems keep growing more complex year after year, and so do the ways software is used. As a result, the types of "barrier" hit when using software are diversifying too; in other words, the needs placed on software are diversifying. If you cannot capture those increasingly complex needs, you will miss opportunities no matter how much proprietary-license revenue you have; capture them accurately, and OSS has a great chance as well.

2014/04/23

mysql multiple masters: multi-source replication

Available from 5.7...
labs.mysql.com

■How does it work?
・Now we have the concept of communication channels. Each communication channel is a connection from slave to master to get the binary log events. That means we will have one IO_THREAD for each communication channel. We will need to run different "CHANGE MASTER" commands, one for each master, with the "FOR CHANNEL" argument that we will use to give a name to that channel.
==>
CHANGE MASTER TO MASTER_HOST='something', MASTER_USER=... FOR CHANNEL='name_of_channel1';
CHANGE MASTER TO MASTER_HOST='something', MASTER_USER=... FOR CHANNEL='name_of_channel2';

START SLAVE FOR CHANNEL='name_of_channel1';
START SLAVE FOR CHANNEL='name_of_channel2';

SHOW SLAVE STATUS FOR CHANNEL='name_of_channel1'\G
SHOW SLAVE STATUS FOR CHANNEL='name_of_channel2'\G

SHOW PROCESSLIST;

・The slave should have been configured first with the crash-safe feature of MySQL 5.6. That means the info usually kept in master.info or relay-log.info should be in a table.
==>
    master_info_repository=TABLE
    relay_log_info_repository=TABLE

■key-words:
MySQL Multi-threaded Slave
MySQL Multi-source Replication

mysql binlog_row_image

In MySQL row-based replication, each row change event contains two images, a “before” image whose columns are matched against when searching for the row to be updated, and an “after” image containing the changes.
Normally, MySQL logs full rows (that is, all columns) for both the before and after images. However, it is not strictly necessary to include every column in both images, and we can often save disk, memory, and network usage by logging only those columns which are actually required.
For the before image, it is necessary only that the minimum set of columns required to uniquely identify rows is logged.
In the after image, it is necessary to log only the columns which have actually changed.



    full: Log all columns in both the before image and the after image.

    minimal: Log only those columns in the before image that are required to identify the row to be changed; log only those columns in the after image that are actually changed.

    noblob: Log all columns (same as full), except for BLOB and TEXT columns that are not required to identify rows, or that have not changed.
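The full/minimal distinction above can be modeled with a small sketch (a conceptual toy, not server code; the column and table names are made up, and the noblob mode is omitted for brevity):

```python
# Conceptual model of binlog_row_image for an UPDATE event.
# pk: set of primary-key column names; before/after: full row dicts.
def row_images(mode, pk, before, after):
    changed = {c for c in after if after[c] != before[c]}
    if mode == "full":
        return before, after
    if mode == "minimal":
        # before image: only the columns needed to find the row (the PK);
        # after image: only the columns that actually changed
        return ({c: before[c] for c in pk},
                {c: after[c] for c in changed})
    raise ValueError(mode)

before = {"id": 1, "name": "alice", "bio": "x" * 1000}
after  = {"id": 1, "name": "alice", "bio": "y" * 1000}
b, a = row_images("minimal", {"id"}, before, after)
print(b)  # {'id': 1} -- enough to locate the row
print(a)  # only the changed 'bio' column is logged
```

With minimal, the large unchanged columns never enter the binary log, which is where the disk, memory, and network savings come from.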
   
■So it comes with a precondition...
Setting this variable has no effect when the binary logging format is STATEMENT.
So set the logging format to "MIXED" (or ROW), and you get
Optimized Row Based Replication

By only replicating those elements of the row image that have changed following INSERT, UPDATE and DELETE operations, replication throughput for both the master and slave(s) can be increased while binary log disk space, network resource and server memory footprint are all reduced.
The slave's and master's tables must have their columns in the same order for this to work...
and that guarantee is the hard part.

mysql GTID replication: the concepts

Global Transaction ID
GTID = source_id:transaction_id
source_id==>the originating server. Normally, the server's server_uuid is used for this purpose
transaction_id==>a sequence number determined by the order in which the transaction was committed on this server
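The source_id:transaction_id format can be illustrated with a tiny parser (a toy sketch; the server_uuid shown is made up):

```python
# Toy parser for the GTID format source_id:transaction_id.
# rsplit on the last ":" so the hyphenated UUID stays intact.
def parse_gtid(gtid):
    source_id, txn = gtid.rsplit(":", 1)
    return source_id, int(txn)

# Example with a made-up server_uuid:
uuid = "3E11FA47-71CA-11E1-9E33-C80AA9429562"
src, txn = parse_gtid(uuid + ":23")
print(src)   # the originating server's server_uuid
print(txn)   # 23: the commit sequence number on that server
```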

■A global transaction identifier (GTID) is a unique identifier created and associated with each transaction when it is committed on the server of origin (master). This identifier is unique not only to the server on which it originated, but is unique across all servers in a given replication setup. There is a 1-to-1 mapping between all transactions and all GTIDs.
==> all servers in a given replication setup?


With this, you can control things at a fine granularity...
But more than a hurdle to clear, it's a risk...
=>In MySQL 5.6.6 and later, this format is also used to supply the argument required by the START SLAVE options SQL_BEFORE_GTIDS and SQL_AFTER_GTIDS.

■GTIDs are always preserved between master and slave. This means that you can always determine the source for any transaction applied on any slave by examining its binary log. In addition, once a transaction with a given GTID is committed on a given server, any subsequent transaction having the same GTID is ignored by that server. Thus, a transaction committed on the master can be applied no more than once on the slave, which helps to guarantee consistency.
==> Well... that's true... it's the obvious behavior

■From the perspective of the database administrator or developer, GTIDs entirely take the place of the file-offset pairs previously required to determine points for starting, stopping, or resuming the flow of data between master and slave. This means that, when you are using GTIDs for replication, you do not need (or want) to include MASTER_LOG_FILE or MASTER_LOG_POS options in the CHANGE MASTER TO statement used to direct a slave to replicate from a given master; in place of these options, it is necessary only to enable the MASTER_AUTO_POSITION option introduced in MySQL 5.6.5.
==> This is the point!

■Flow
1. A transaction is executed and committed on the master.
=> This transaction is assigned a GTID using the master's UUID and the smallest nonzero transaction sequence number not yet used on this server; the GTID is written to the master's binary log.
2. After the binary log data is transmitted to the slave and stored in the slave's relay log,
the slave reads the GTID and sets the value of its gtid_next system variable to this GTID. This tells the slave that the next transaction must be logged using this GTID.
It is important to note that the slave sets gtid_next in a session context.
3. The slave checks to make sure that this GTID has not already been used to log a transaction in its own binary log. If and only if this GTID has not been used, the slave then writes the GTID and applies the transaction (and writes the transaction to its binary log). By reading and checking the transaction's GTID first, before processing the transaction itself, the slave guarantees not only that no previous transaction having this GTID has been applied on the slave, but also that no other session has already read this GTID but has not yet committed the associated transaction. In other words, multiple clients are not permitted to apply the same transaction concurrently.
==> It really checks every single time?
4. Because gtid_next is not empty, the slave does not attempt to generate a GTID for this transaction but instead writes the GTID stored in this variable (that is, the GTID obtained from the master) immediately preceding the transaction in its binary log.
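Steps 2-4 above can be sketched as a toy apply loop (illustrative only; a real server does this inside the binary log and storage engine, and the GTID values are made up):

```python
# Toy model of the slave side: apply a transaction only if its GTID has
# not already been logged, and log it under the master's GTID, never a
# locally generated one.
class ToySlave:
    def __init__(self):
        self.executed = set()   # GTIDs already in our binary log
        self.binlog = []

    def apply(self, gtid, transaction):
        # step 3: skip transactions whose GTID we have already logged
        if gtid in self.executed:
            return False
        # step 4: record the master's GTID immediately before the transaction
        self.executed.add(gtid)
        self.binlog.append((gtid, transaction))
        return True

s = ToySlave()
print(s.apply("uuid:1", "INSERT ..."))  # True: applied and logged
print(s.apply("uuid:1", "INSERT ..."))  # False: duplicate GTID, ignored
```

This is the mechanism behind "a transaction committed on the master can be applied no more than once on the slave".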

■Thanks to that, various benefits... What are the worries? The risks?
・Binary Log Group Commit
A significant improvement to replication performance has been enabled by the introduction of Binary Log Group Commit which groups writes to the Binlog,
rather than applying them one at a time.This enhancement significantly improves the performance of the replication master.
・Multi-Threaded Slaves
Slaves are better able to keep up with the master
・Optimized Row Based Replication
・The power of GTIDs is fully realized when used with the new MySQL replication utilities
mysqlfailover
mysqlrpladmin

・crash safe
The binlog and table data are transactionally consistent when using a transactional storage engine, and so the slave can automatically roll back replication to the last committed event before a crash, and resume replication without administrator intervention. Not only does this reduce operational overhead, it also eliminates the risk of data loss or corruption caused by the failure of a slave.
If a crash to the master causes corruption of the binary log, the server will automatically recover it to a position where it can be read correctly.


・There is also a checksum mechanism, so things get smarter in various ways

mysql Crash-safe Replication

■A common request is to have replication crash-safe in the sense that the replication progress information always is in sync with what has actually been applied to the database, even in the event of a crash. Although transactions are not lost if the server crashes, it could require some tweaking to bring the slaves up again.
=> It has to remember the actual progress information...

In the latest MySQL 5.6 milestone development release, the replication team has implemented crash-safety for the slave by adding the ability to commit the replication information together with the transaction (see Figure 1).
Also, some fixes were done on the master to ensure that it recovers correctly.
=> However much they say "together", being literally simultaneous in time is impossible; it simply relies on the transactional properties (ACID).
And cooperation on the master side is needed too.


If you're familiar with replication, you know that the replication information is stored in two files: master.info and relay-log.info. The update of these files is arranged so that they are updated after the transaction has been applied. This means that if you have a crash between the transaction commit and the update of the files, the replication progress information will be wrong. In other words, a transaction cannot be lost this way, but there is a risk that a transaction could be applied one more time. The usual way to avoid this is to have a primary key on all your tables. In that case, a repeated update of the table would cause the slave to stop, and you would have to use SQL_SLAVE_SKIP_COUNTER to skip the transaction and get the slave up and running again. This is better than losing a transaction, but it is nevertheless a nuisance. Removing the primary key to prevent the slave from stopping will only solve the problem partially: it means that the transaction would be applied twice, which would both place a burden on the application to handle dual entries and also require the tables to be cleaned regularly. Both of these approaches require either manual intervention or scripting support to handle.
=> This is the micro-level detail.
Because those two files are updated afterwards, a transaction is never lost,
but it can end up being applied twice. That's bad, but if every table has a primary key, the duplicate can be detected.
Using SQL_SLAVE_SKIP_COUNTER to skip the transaction is fine.
But does every table really have a primary key?
Otherwise you have no choice but to inspect the transaction body itself; part of it may already have been applied, so you can't rest easy...
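The duplicate-apply window described above can be simulated: if the position is stored outside the transaction, a crash after the commit but before the position update makes the event replay on restart (a toy simulation, not MySQL internals):

```python
# Toy simulation: position stored separately (old relay-log.info style)
# vs. committed atomically with the data (5.6 table repositories).
def run(events, atomic, crash_after_apply_of=None):
    data, pos = [], 0
    while pos < len(events):
        data.append(events[pos])            # apply the transaction
        if not atomic and events[pos] == crash_after_apply_of:
            # crash between the apply and the relay-log.info update:
            # on restart we resume from the stale position -> re-apply
            crash_after_apply_of = None
            continue
        pos += 1                            # position advances with the data
    return data

print(run(["t1", "t2"], atomic=True))                              # exactly once
print(run(["t1", "t2"], atomic=False, crash_after_apply_of="t1"))  # t1 applied twice
```

With atomic=True the position moves in the same step as the data, which is exactly what committing the progress table in the same transaction buys you.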


■Crash-safe masters
Two problems related to crash-safe replication have been fixed in the master, both of which could cause some annoyance when the master recovered.

    If the master crashed when a binary log was rotated, it was possible that some orphan binlog files ended up in the binary log index file. This was fixed in 5.1 but is also a piece in the puzzle of having crash-safe replication.
    Writing to the binary log is not an atomic operation, and if a crash occurred while writing to the binary log, there was a possibility of a partial event at the end of the binary log.

    Now, the master recovers from this by truncating the binary log to the last known good position, removing the partially written transaction and rolling back the outstanding transactions in the storage engines.
=> These are things that should work anyway... they fixed earlier shortcomings. Hardly something to advertise separately.




■Crash-safe slaves
The MySQL replication team decided to instead implement crash-safety by moving the replication progress information into system tables.
This is a more flexible solution and has several advantages compared to storing the positions in the InnoDB transaction log.

・If the replication information and data is stored in the same storage engine, it will allow both the data and the replication position to be updated as a single transaction, which means that it is crash-safe.
・If the replication information and data is stored in different storage engines, but both support XA, they can still be committed as a single transaction.
・The replication information is flushed to disk together with the transaction data, so writing it directly to the InnoDB redo log would offer no speed advantage; storing it in tables, on the other hand, does not prevent the user from reading the replication progress information easily.
・The tables can be read from a normal session using SQL commands, which also means that it can be incorporated into such things as stored procedures and stored functions.
ーーーーーーーーーーーーーーー
Table: slave_master_info
(column name / line in master.info / SHOW SLAVE STATUS column)
Master_id                  -     -
Number_of_lines            1     -
Master_log_name            2     Master_Log_File
Master_log_pos             3     Read_Master_Log_Pos
Host                       4     Master_Host
User_name                  5     Master_User
User_password              6     -
Port                       7     Master_Port
Connect_retry              8     Connect_Retry
Enabled_ssl                9     Master_SSL_Allowed
Ssl_ca                     10    Master_SSL_CA_File
Ssl_capath                 11    Master_SSL_CA_Path
Ssl_cert                   12    Master_SSL_Cert
Ssl_cipher                 13    Master_SSL_Cipher
Ssl_key                    14    Master_SSL_Key
Ssl_verify_server_cert     15    Master_SSL_Verify_Server_Cert
Heartbeat                  16    -
Bind                       17    Master_Bind
Ignored_server_ids         18    Replicate_Ignore_Server_Ids
Uuid                       19    Master_UUID
Retry_count                20    Master_Retry_Count

Table: slave_relay_log_info
(column name / line in relay-log.info / SHOW SLAVE STATUS column)
Master_id                  -     -
Number_of_lines            1     -
Relay_log_name             2     Relay_Log_File
Relay_log_pos              3     Relay_Log_Pos
Master_log_name            4     Relay_Master_Log_File
Master_log_pos             5     Exec_Master_Log_Pos
Sql_delay                  6     SQL_Delay
ーーーーーーーーーーーーーーー

■Engine
The new tables are created at installation using the mysql_install_db script, as usual, and the default engine for these tables is the same as for all system tables: MyISAM. As you know, MyISAM is not very transactional, so it is necessary to switch them to InnoDB if you really want crash-safety. To change the engine for these tables you can just use a normal ALTER TABLE.

slave> ALTER TABLE mysql.slave_master_info ENGINE = InnoDB;
slave> ALTER TABLE mysql.slave_relay_log_info ENGINE = InnoDB;
Note that this works for these tables because they were designed to allow any storage engine to be used for them, but it does not mean that you can change the storage engine for other system tables and expect it to work.
==> This point bothers me a bit. Is it OK? Are the MyISAM system tables themselves OK?


■Master side:
START TRANSACTION;
INSERT INTO articles(user, title, body)
      VALUE (4711, 'Taming the Higgs Boson using Clicker Training', '....');
UPDATE users SET articles = articles + 1 WHERE user_id = 4711;
COMMIT;
Slave side:

START TRANSACTION;
INSERT INTO articles(user, title, body)
      VALUE (4711, 'Taming the Higgs Boson using Clicker Training', '....');
UPDATE users SET articles = articles + 1 WHERE user_id = 4711;
UPDATE mysql.slave_relay_log_info
   SET Master_log_pos = @@Exec_Master_Log_Pos,
       Master_log_name = @@Relay_Master_Log_File,
       Relay_log_name = @@Relay_Log_File,
       Relay_log_pos = @@Relay_Log_Pos;
COMMIT;
(The @@... names above are pseudo-variables standing for the current positions; this UPDATE illustrates what the slave effectively does inside the transaction, not SQL you run yourself.)


The slave_master_info table is not updated with each processed event. Depending on the value of sync_master_info there are a few alternatives:
If sync_master_info = 0
    In this case, the slave_master_info table is only updated when the slave starts or stops (for any reason, including errors), if the relay log is rotated, or if you execute a CHANGE MASTER command.
If sync_master_info > 0
    Then the slave_master_info table will be updated every sync_master_info events.
   
■Problem:
==> the UPDATE statement at the end of each transaction. This means
every transaction needs to lock the same row! This makes it impossible to run
two transactions in parallel on the slave.
My suggestion: it would make sense to include the event group's database as the second component of the primary key; this way it is possible to obtain the current position for a given database/apply thread, as well as globally, depending on the replication method.

mysql replication event checksum

Since the checksums are added to all events in the binary log on the master, transferred over the network, and written to the relay log on the slave, it is possible to detect events corrupted by hardware problems, network failures, or software bugs.
--> A verification mechanism has been added.
The checksum used is a CRC-32 checksum, more precisely ISO-3309, which is the one supplied with zlib.
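Since it is the zlib CRC-32, the check a slave performs can be reproduced with any zlib binding. A toy sketch (the framing here is a simplification of the real event layout, and the event bytes are made up):

```python
import struct
import zlib

# Frame an event as: event bytes + 4-byte little-endian CRC-32 of those bytes.
def add_checksum(event_bytes):
    crc = zlib.crc32(event_bytes) & 0xFFFFFFFF
    return event_bytes + struct.pack("<I", crc)

# Verification: recompute the CRC over the body and compare with the
# stored value at the end of the frame.
def verify(framed):
    body, stored = framed[:-4], struct.unpack("<I", framed[-4:])[0]
    return (zlib.crc32(body) & 0xFFFFFFFF) == stored

ev = add_checksum(b"fake binlog event")
print(verify(ev))               # True: intact
print(verify(b"X" + ev[1:]))    # False: single-byte corruption is detected
```

CRC-32 detects any single corrupted byte, which is exactly the hardware/network/software-bug corruption case the feature targets.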

==> Nothing in particular to worry about

mysql GTID replication setup

■cold start

1. If replication is already running, synchronize both servers by making them read-only.
=> SET @@global.read_only = ON;
2. Stop both servers.
=> mysqladmin -u username -p shutdown
3. Restart both servers with GTIDs, binary logging, and slave update logging enabled, and with statements that are unsafe for GTID-based replication disabled. In addition, the servers should be started in read-only mode, and the slave SQL and I/O threads should be prevented from starting on the slave.
=> mysqld_safe --gtid_mode=ON --log-bin --log-slave-updates --disable-gtid-unsafe-statements &
4. Instruct the slave to use the master as the replication data source and to use auto-positioning, and then start the slave.
=>
mysql> CHANGE MASTER TO
     >     MASTER_HOST = host,
     >     MASTER_PORT = port,
     >     MASTER_USER = user,
     >     MASTER_PASSWORD = password,
     >     MASTER_AUTO_POSITION = 1;
     START SLAVE;
5. Disable read-only mode on both servers, so that they can once again accept updates.
=>SET @@global.read_only = OFF;


・Use the mysql client to import a dump file created with mysqldump. Use the --master-data option to include binary logging information, and --set-gtid-purged (available in MySQL 5.6.9 and later) set to AUTO (the default) or ON to include information about executed transactions.
・Stop the slave, copy the contents of the master's data directory to the slave's data directory, then restart the slave.
・Injecting empty transactions
==>
SET GTID_NEXT='aaa-bbb-ccc-ddd:N';

BEGIN;
COMMIT;

SET GTID_NEXT='AUTOMATIC';


==> I don't really get it yet... need to try it out...

Constraints:
・the storage engine must be transactional
・CREATE TABLE ... SELECT is not supported when using GTID-based replication.
・sql_slave_skip_counter is not supported when using GTIDs



■simple-setup
==> This is good for the initial stage. With configuration alone, the SLAVE automatically chases the MASTER.



■Load a snapshot of the MASTER, to save that much catch-up time.

1. DUMP
★Take a dump of the MASTER
mysqldump --master-data --set-gtid-purged=ON

--master-data
==>Use this option to dump a master replication server to produce a dump file that can be used to set up another server as a slave of the master.
The --master-data option automatically turns off --lock-tables. It also turns on --lock-all-tables, unless --single-transaction also is specified, in which case, a global read lock is acquired only for a short time at the beginning of the dump (see the description for --single-transaction). In all cases, any action on logs happens at the exact moment of the dump


★Restore it on the SLAVE
mysql

2. File copy
Stop the slave, copy the contents of the master's data directory to the slave's data directory, then restart the slave.
=> after first deleting auto.cnf, mysql.sock, and mysql.pid (auto.cnf holds the server_uuid, which must differ between servers)

3. The mysqlbinlog method

mysql multi-threaded slaves


Here are three key variables that you need to set to achieve maximum performance with multi-threaded slaves:

binlog-format=STATEMENT
relay-log-info-repository=TABLE
master-info-repository=TABLE

The variable slave_parallel_workers should be set equal to the number of schemas that you have.

■Right...
MySQL replication used to be handled by just two threads: the IO thread and the SQL thread. The traditional behavior was that the IO thread fetched the binary log from the master, and the SQL thread executed the statements recorded in it to apply the data. In the new multi-threaded replication, a coordinator is placed between the IO thread and the SQL threads.
The SQL thread is further split into multiple worker threads. The coordinator reads events from the relay log, determines which database each update targets, and dispatches it to the worker in charge of that database. Each worker executes the SQL it receives and updates the database.
Interaction between the Coordinator and the Workers follows the
producer-consumer model.
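The per-schema dispatch above can be sketched with one queue per worker (a toy producer-consumer model, not server code; the schema names are made up):

```python
import queue
import threading
from collections import defaultdict

# Toy coordinator: route each event to the worker that owns its schema, so
# different databases apply in parallel while events within one schema keep
# their order (same schema always hashes to the same worker queue).
def dispatch(events, n_workers):
    queues = [queue.Queue() for _ in range(n_workers)]
    applied = defaultdict(list)

    def worker(q):
        while True:
            ev = q.get()
            if ev is None:          # shutdown marker
                return
            schema, stmt = ev
            applied[schema].append(stmt)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    for schema, stmt in events:     # producer: the coordinator
        queues[hash(schema) % n_workers].put((schema, stmt))
    for q in queues:
        q.put(None)
    for t in threads:
        t.join()
    return dict(applied)

out = dispatch([("db1", "s1"), ("db2", "s2"), ("db1", "s3")], 2)
print(out["db1"])  # ['s1', 's3'] -- order within a schema is preserved
```

This also shows why slave_parallel_workers beyond the number of schemas buys nothing: a schema is never split across workers.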



MTS is configurable with just a few options:

  --slave-parallel-workers=#
                      Number of worker threads for executing events in parallel
  --slave-checkpoint-period=#(default:300)
                      Gather worker activities to Update progress status of
                      Multi-threaded slave and flush the relay log info to disk
                      after every #th milli-seconds;
  --slave-checkpoint-group=#(default:512)
                      Maximum number of processed transactions by
                      Multi-threaded slave before a checkpoint operation is
                      called to update progress status;
    ==>
    The second and third options (checkpoint period and group count) control
    how often the Coordinator thread updates `mysql`.`slave_relay_log_info`.
                     
  --slave-pending-jobs-size-max=#(default:16M)
                      Max size of Slave Worker queues holding yet not applied
                      events. The least possible value must be not less than the
                      master side max_allowed_packet;
===> The fourth option simply constrains RAM usage so that replication events
do not consume memory beyond some threshold.



■When it cannot apply in parallel, MTS temporarily
switches to sequential applying,
and switches back automatically.

  START SLAVE UNTIL SQL_AFTER_MTS_GAPS;

SELECT * FROM mysql.slave_worker_info;

2014/04/18

play framework: way of thinking

■full-stack? framework
Being ‘full-stack’ is not just a question of functionality, which may already
exist as a collection of open-source libraries. After all, what’s the point of a
framework if these libraries already exist and already provide everything you need
to build an application? The difference is that a full-stack framework also provides
a documented pattern for using separate libraries together in a certain way, and
therefore confidence that a developer can make the separate components work
together. Without this, you never know whether you are going to end up with two
incompatible libraries, or a badly-designed architecture.

the common tasks are directly supported in a simple way, which saves you
time.

■J2EE and Play Framework: J2EE's design philosophy
Many layers, so development efficiency is poor; but it's easier for the architects.
・The Servlet API was originally intended to be an end-user API for web
developers, using Servlets (the name for controller Java classes), and JavaServer
Pages (JSP) view templates. When new technologies eventually superseded JSP,
they were layered on top, instead of eventually being folded back into Java EE,
either as updates to the Servlet API or as a new API. With this approach, the
Servlet API becomes an additional layer that makes it harder to debug HTTP
requests. This may keep the architects happy, but at the cost of developer
productivity.
・JSF focuses on components and server-side state, which also seemed like a good
idea, and gave developers powerful tools for building web applications
it turned out that the resulting complexity and the mismatch with HTTP
itself made JSF hard to use productively.

Play provides a redesigned web stack that
doesn’t use the Servlet API and works better with HTTP and the web

■Is this part really different? I can't say I've felt it firsthand yet...
Web frameworks for web developers are different. They embrace HTTP and
provide APIs that use HTTP's features instead of trying to hide HTTP, in the
same way that web developers build expertise in the standard web technologies
(HTTP, HTML, CSS and JavaScript) instead of avoiding them.

In the past, none of these web frameworks were written in Java, because the
Java platform’s web technologies failed to emphasise simplicity, productivity and
usability. This is the world that started with Perl (not Lisp as some might assume),
was largely taken over by PHP, and in more recent years has seen the rise of Ruby
on Rails.

■What I want
・Simplicity comes from making it easy to do simple things in a
few lines of code
・Productivity starts with being able to make a code change, reload the web page
in the browser, and see the result

■HelloWorld
play new "HelloWorld"
cd HelloWorld
play run
=> http://localhost:9000/

app/controllers/Application.java
==> If there is an error, you get a friendly compilation error page in the browser.

conf/application.conf
==> DB connection settings
conf/routes
==> GET /hello controllers.Application.hello(name:String)
template
==> app/views/hello.scala.html (the template language)
==>
return ok(views.html.hello.render("hello "+name +"! do you need something?"));

■ORM
JPA has more or less become the default persistence API in the Java ecosystem,
particularly in the JEE world. However, while Play doesn’t prevent you from using
JPA and/or Hibernate (or any other persistence solution, for that matter), it isn’t the
default solution. This is because a big part of the JPA spec is based on managing
entity state across e.g. requests, which is something Play has no use for, due to its
stateless nature 

Instead it favors a sessionless API: no need to "refresh an entity", and no distinction
between "attached" and "detached" entities to learn.




Play is a Rails-inspired MVC framework built on the JVM. It has APIs for both Scala and Java and uses class reloading to quickly surface changes in the browser. A nice feature is the templating system that uses Scala to generate HTML. A template gets compiled to a Scala object and can be called from controllers, with the input types checked.

2014/04/16

mysql delete truncate

★To delete all the data in a table:
TRUNCATE TABLE tbl_name;
DELETE FROM tbl_name;


The difference is that DELETE FROM tbl_name removes the rows one by one, whereas TRUNCATE TABLE tbl_name
effectively does a DROP TABLE tbl_name and then recreates the same table.
・Dropping and recreating the table is usually much faster than deleting the rows one by one.
・It also resets the AUTO_INCREMENT counter.


If the table is InnoDB and is the parent table of a foreign-key reference, executing TRUNCATE TABLE tbl_name falls back to deleting the data with DELETE FROM tbl_name.
In that case, the ON DELETE clauses of the child tables are invoked. For MyISAM the data is removed by dropping the table, so the child tables' ON DELETE clauses are not invoked.
If the table has an AUTO_INCREMENT column, deleting all the data with TRUNCATE resets the AUTO_INCREMENT value.
This reset happens in the same way even when the table is InnoDB and the parent of a foreign-key reference.
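The AUTO_INCREMENT difference can be modeled with a toy table (a conceptual sketch of the semantics above, not of MySQL's behavior in every edge case):

```python
# Toy model: DELETE removes rows but keeps the AUTO_INCREMENT counter;
# TRUNCATE recreates the table, so the counter restarts at 1.
class ToyTable:
    def __init__(self):
        self.rows, self.auto_inc = [], 1

    def insert(self):
        self.rows.append(self.auto_inc)
        self.auto_inc += 1

    def delete_all(self):        # DELETE FROM t
        self.rows.clear()        # counter untouched

    def truncate(self):          # TRUNCATE TABLE t (drop + recreate)
        self.__init__()

t = ToyTable()
t.insert(); t.insert()
t.delete_all(); t.insert()
print(t.rows)   # [3]: DELETE kept the counter running
t.truncate(); t.insert()
print(t.rows)   # [1]: TRUNCATE reset it
```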