Free To Feel

Heading toward entrepreneurship.


Joshua Chi
GitHub

    How And When To Use MySQL FEDERATED Storage Engine

    I learned from a mixi update that they are using the MySQL FEDERATED engine for DB partitioning.

    Use FEDERATED tables (available from MySQL 5.0).
    Or do the SELECT twice, which is faster than using FEDERATED tables.
    

    The official MySQL documentation doesn't say much about how and when to use the FEDERATED engine. You can learn more from MySQL Federated storage engine 1 and MySQL Federated storage engine 2.

    Finally I found some clues in MySQL Federated storage engine 2:

    Strengths:
    
    For example, pretend you have a set of distributed servers working on small parts of a large task, and their results need to be merged back together when done without conflict. Many solutions to this problem involve modulo arithmetic for generating primary keys. This could be a good use of a FEDERATED table: just federate one central table on all the servers, have the processes INSERT into the table, and they’ll get non-conflicting primary key numbers. That’s a trivially easy way to coordinate distributed resource requests.
    
    The way it lets you mis-define tables holds great potential. For example, Giuseppe Maxia has already noted that you can define a FEDERATED table against a view. Views don’t have indexes (yet), but that shouldn’t stop you from telling the engine it does! That way, your WHERE clauses are sent through to the remote server unharmed, where the view can execute GROUP BY queries and the like. Giuseppe even outlines a way to get the remote server to execute arbitrary commands via a FEDERATED table!
    
    What about combinations with replication, triggers and so forth? There must be many more cool hacks waiting to be discovered.
    
    

    And in Accessing Distributed Data with the Federated Storage Engine, you can find a simple demo that 'merges' two partitioned tables into one by creating a view. Although this is not what I expected, it still makes sense in some cases.

    I still have some questions: what does performance look like when a view is used this way? And what are the row-count or table-size limits when using the FEDERATED storage engine?
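    To make the central-table idea from the quote concrete, here is a minimal sketch of defining a FEDERATED table against a remote table. The host, credentials, database, and table names below are all hypothetical, and the local definition must mirror the remote one:

```sql
-- On the remote (data) server: an ordinary local table
CREATE TABLE remote_db.ticket (
  id INT NOT NULL AUTO_INCREMENT,
  worker VARCHAR(32),
  PRIMARY KEY (id)
) ENGINE=InnoDB;

-- On each worker server: a FEDERATED shadow of the central table.
-- Only the engine and CONNECTION string differ from the remote DDL.
CREATE TABLE ticket (
  id INT NOT NULL AUTO_INCREMENT,
  worker VARCHAR(32),
  PRIMARY KEY (id)
) ENGINE=FEDERATED
  CONNECTION='mysql://fed_user:fed_pass@remote-host:3306/remote_db/ticket';
```

    Workers that INSERT into their local `ticket` table then receive non-conflicting auto-increment ids handed out by the central server, which is the coordination trick described in the quote.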

    Network Sharing Between Ubuntu And iPhone 4 (China Unicom Contract Version)

    I found a solution on the Internet that enables network sharing between Ubuntu and the iPhone 4.

    sudo add-apt-repository ppa:pmcenery/ppa
    
    sudo apt-get update
    
    sudo apt-get install gvfs ipheth-dkms ipheth-utils
    

    If all these steps pass, restart your machine. Bingo!

    You can find more information here.

    Restore Lost Commits After Git Hard Reset

    Sometimes you will make mistakes. But how will you feel when someone tells you there is still hope if you try one more thing?

    The following might be one solution after running 'git reset --hard' and realizing you made a mistake:

      $ git reflog
      $ git merge 7c61179
    

    Check the blog post here.
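    The whole recovery can be sketched end-to-end in a throwaway repository (file names and commit messages here are made up for the demo):

```shell
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q
git config user.email demo@example.com
git config user.name demo
echo one > f.txt && git add f.txt && git commit -qm "first"
echo two > f.txt && git commit -qam "second"
lost=$(git rev-parse HEAD)     # the commit we are about to lose
git reset --hard -q HEAD~1     # oops: "second" is gone from the branch
git reflog | head -n 2         # ...but the reflog still remembers it
git reset --hard -q "$lost"    # bring it back (git merge <sha> also works)
cat f.txt                      # prints: two
```

    The key point is that 'git reset --hard' only moves the branch pointer; the commit objects stay in the repository and the reflog keeps a trail of every position HEAD has been at.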

    Replace MySQL Table With NoSQL DB

    Background:

    I am not a NoSQL fan, and I am not in the habit of replacing a whole MySQL DB with a NoSQL DB. But if a single MySQL table can be replaced with a NoSQL DB, and expensive queries are being generated against that table, I will give NoSQL a try.

    In our product, we have a unique user login log table that tells us how many users logged in on a specific date. Here is the table structure:

      unique_login_logs: 
        _attributes:  {phpName: UniqueLoginLog}
        id:
        user_id: { type: integer }
        date: { type: date, index: true }
        created_at:
        _uniques:
          unique_date_type: [user_id, date]
    
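    For readers not using this schema format, the definition above corresponds roughly to the following DDL (a sketch; the exact types of id and created_at are assumed defaults, not taken from the schema):

```sql
CREATE TABLE unique_login_logs (
  id INT NOT NULL AUTO_INCREMENT,
  user_id INT,
  date DATE,
  created_at DATETIME,
  PRIMARY KEY (id),
  KEY date_index (date),
  UNIQUE KEY unique_date_type (user_id, date)
) ENGINE=InnoDB;
```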

    The expensive query looks like:

    select date, count(`user_id`) from unique_login_logs group by date order by date desc;
    

    The only other operations on this table are saving an entry and checking whether a user's entry for a specific date is in the table or not. Nothing more.

    Choosing a NoSQL DB from the wiki page:

    ...
    Key-value cache in RAM
    
        Citrusleaf database
        memcached
        Oracle Coherence
        Redis
        Tuple space
        Velocity
    
    Key-value stores implementing the Paxos algorithm
    
        Keyspace
    
    Key-value stores on disk
    
        BigTable
        CDB
        Citrusleaf database
        Dynomite
        Keyspace
        membase
        Memcachedb
        Redis
        Tokyo Cabinet
        TreapDB
        Tuple space
        MongoDB
    ...
    

    For my feature request, I need a disk-backed key-value system. I spent just one minute looking at MongoDB's features; it meets my requirements, and that's enough. If you know more about the other NoSQL DBs, of course you should choose the one that suits you best.

    Try MongoDB

    Step 1: Install MongoDB (http://www.mongodb.org/display/DOCS/Quickstart+Unix);

    Step 2: Simple demo testing:

    Create a new db:

    use mydb;
    

    Save an entry:

    db.uniqueLoginLog.save({date: '2011-06-09', user_id: 101});
    db.uniqueLoginLog.save({date: '2011-06-09', user_id: 111});
    db.uniqueLoginLog.save({date: '2011-06-09', user_id: 102});
    ..
    db.uniqueLoginLog.save({date: '2011-06-11', user_id: 100});
    db.uniqueLoginLog.save({date: '2011-06-11', user_id: 101});
    db.uniqueLoginLog.save({date: '2011-06-11', user_id: 102});
    

    Group entries by date:

    db.uniqueLoginLog.group(
               {key: { date:true },
                reduce: function(obj,prev) { prev.count ++; },
                initial: { count: 0 }
                });
                
    

    The output looks like:

    [
        {
            "date" : "2011-06-09",
            "count" : 3
        },
        {
            "date" : "2011-06-10",
            "count" : 3
        },
        {
            "date" : "2011-06-11",
            "count" : 1
        },
        {
            "date" : "2011-06-12",
            "count" : 1
        }
    ]
    

    With this output, I can do the ordering on the client side.
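    Since group() returns the per-date counts unordered, the 'order by date desc' from the original SQL can be recovered on the client. A small sketch in JavaScript (the data values are just examples):

```javascript
// group() output: one {date, count} object per distinct date, unordered
const rows = [
  { date: "2011-06-09", count: 3 },
  { date: "2011-06-11", count: 3 },
  { date: "2011-06-10", count: 1 },
];

// ISO dates sort correctly as strings, so a string compare is enough
rows.sort((a, b) => b.date.localeCompare(a.date));

console.log(rows.map(r => r.date).join(","));
// prints: 2011-06-11,2011-06-10,2011-06-09
```

    The same string-compare trick works in PHP with usort(), since the dates are stored in YYYY-MM-DD form.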

    Install PHP Mongo Extension

    You can install the Mongo PHP extension with the pecl command line tool, or install it manually:

    > pecl search mongo
    Retrieving data...0%
    Matched packages, channel pecl.php.net:
    =======================================
    Package Stable/(Latest) Local
    mongo   1.1.4 (stable)        MongoDB database driver
    
    > pecl install mongo
    

    Get the stored items with PHP

    PHP code:

    $m = new Mongo();                  // connects to localhost:27017 by default
    $db = $m->mydb;                    // select the database
    $collection = $db->uniqueLoginLog; // select the collection

    // insert one login entry
    $obj = array('date' => '2011-06-09', 'user_id' => 101);
    $collection->insert($obj);

    // group by date, counting entries per date (same as the shell demo)
    $keys = array("date" => 1);
    $initial = array("count" => 0);
    $reduce = "function(obj,prev) { prev.count++; }";
    $g = $collection->group($keys, $initial, $reduce);

    var_dump($g);
    
    

    Output looks like:

    array(4) {
      ["retval"]=>
      array(3) {
        [0]=>
        array(2) {
          ["date"]=>
          string(10) "2011-06-09"
          ["count"]=>
          float(17)
        }
        [1]=>
        array(2) {
          ["date"]=>
          string(10) "2011-06-11"
          ["count"]=>
          float(10)
        }
        [2]=>
        array(2) {
          ["date"]=>
          string(10) "2011-06-10"
          ["count"]=>
          float(7)
        }
      }
      ["count"]=>
      float(34)
      ["keys"]=>
      int(3)
      ["ok"]=>
      float(1)
    }
    

    Deploy to production?

    So far, the demo is still a toy. If you want to deploy it to production, there is more work to be done. Take a look at the admin zone.

    For my feature implementation, I will consider replication as the next step.

    Summary:

    This is not a tutorial on how to use MongoDB. It is just an example showing that a NoSQL DB can be used in your production system.


    When To Use Combined Index

    There are already a lot of posts that give an introduction to what index merge is in MySQL. One of them is:

    Playing with MySQL's index merge

    Today I noticed that a combined index can also be affected by single-column indexes. For example:

    1. Let's say we have a table named table1 with 100,000 rows. The size of this table is 35MB.

    Table structure:

    CREATE TABLE `table1` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `name` varchar(20) DEFAULT NULL,
      `status1` int(11) DEFAULT NULL,
      `status2` int(11) DEFAULT NULL,
      `status3` int(11) DEFAULT NULL,
      `status4` int(11) DEFAULT NULL,
      `status5` int(11) DEFAULT NULL,
      `status6` int(11) DEFAULT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `table1_name_unique` (`name`),
      KEY `table1_status1` (`status1`),
      KEY `table1_status2` (`status2`),
      KEY `table1_status3` (`status3`),
      KEY `table1_status4` (`status4`),
      KEY `table1_status5` (`status5`),
      KEY `table1_status6` (`status6`),
      KEY `table1_combine_index` (`status1`,`status2`,`status3`,`status4`,`status5`,`status6`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    

    We have a query that selects items from table1, checking all the status columns in the WHERE clause. Let's run an EXPLAIN:

    explain select * from table1 where status1=1 and status2=1 and status3=1 and status4=1 and status5=1 and status6=1  
    

    Explain result:

      select_type: SIMPLE
      table: table1
      type: index_merge
      possible_keys: PRIMARY,table1_status1,table1_status2,table1_status3,table1_status4,table1_status5,table1_status6,table1_combine_index
      key: table1_status1,table1_status2,table1_status3,table1_status4,table1_status5,table1_status6
      key_len: 1,1,4,4,4,5
      refs: NULL
      rows: 1955
      Extra: Using intersect(table1_status1,table1_status2,table1_status3,table1_status4,table1_status5,table1_status6); Using where
    

    The select query takes 3.5ms by intersecting indexes.

    What happens if we force the query to use table1_combine_index:

    explain select * from table1 force index (table1_combine_index)  where status1=1 and status2=1 and status3=1 and status4=1 and status5=1 and status6=1  
    

    Explain result:

      select_type: SIMPLE
      table: table1
      type: ref
      possible_keys: table1_combine_index
      key: table1_combine_index
      key_len: 9
      refs: const,const
      rows: 62590
      Extra: Using where
    

    The select query takes 10.5ms using the combined index.

    In this example, intersecting indexes is more useful than the combined index. There is an article that gives a more detailed explanation of why it works like this:

    Multi Column indexes vs Index Merge

    2. Now it's time to make this example a little more complicated.

    We have another table named table2, the structure looks like:

    CREATE TABLE `table2` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `table1_id` int(11) NOT NULL,
      `column1` datetime NOT NULL,
      `column2` varchar(255) NOT NULL,
      `column3` text,
      `column4` tinyint(4) NOT NULL,
      `column5` tinyint(4) NOT NULL,
      `column6` tinyint(4) NOT NULL,
      `column7` varchar(255) NOT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `column7` (`column7`),
      KEY `table2_FI_1` (`table1_id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8
    

    There are 20,000 rows in table2, with a size of 11MB. We have a feature request that needs to join table1 with table2 and sort by table2's id.

    SELECT sql_no_cache * FROM table2 LEFT JOIN table1 ON (table2.table1_id = table1.id) WHERE table1.status1=1 and table1.status2=1 and table1.status3=1 and table1.status4=1 and table1.status5=1 and table1.status6=1 ORDER BY table2.id DESC LIMIT 10;
    

    It takes 3s to finish the query. The EXPLAIN tells you:

      select_type: SIMPLE
      table: table1
      type: index_merge
      possible_keys: PRIMARY,`table1_status1`,`table1_status2`,`table1_status3`,`table1_status4`,`table1_status5`,`table1_status6`,`table1_combine_index`
      key: `table1_status1`,`table1_status2`,`table1_status3`,`table1_status4`,`table1_status5`,`table1_status6`
      key_len: 1,1,4,4,4,5
      refs: NULL
      rows: 1955
      Extra: Using intersect(`table1_status1`,`table1_status2`,`table1_status3`,`table1_status4`,`table1_status5`,`table1_status6`); Using where
      
      ---
      select_type: SIMPLE
      table: table2
      type: ref
      possible_keys: table2_FI_1
      key: table2_FI_1
      key_len: 4
      refs: table1.id
      rows: 1
      Extra:  
    

    Now I want to remove the single-column indexes on table1 to see what happens:

      select_type: SIMPLE
      table: table2
      type: index
      possible_keys: table2_FI_1
      key: PRIMARY
      key_len: 4
      refs: 
      rows: 10
      Extra:
      
      ---
      
      select_type: SIMPLE
      table: table1
      type: eq_ref
      possible_keys: PRIMARY,table1_combine_index
      key: PRIMARY
      key_len: 4
      refs: table2.table1_id
      rows: 1
      Extra: Using where
    

    After removing the single-column indexes on table1, the select query takes only 2.4ms. To explain this, we can simply treat (status1,status2,status3,status4,status5,status6) as one column of table1. I think the single-column indexes hurt the join performance: MySQL used them to filter table1 first and then joined to table2, instead of walking table2 by primary key in ORDER BY order and probing table1 with eq_ref, as the second plan shows.
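    For the record, dropping those single-column indexes (index names taken from the table1 DDL above) can be done in one statement; this is a sketch and worth trying on a copy of the table first:

```sql
ALTER TABLE table1
  DROP INDEX table1_status1,
  DROP INDEX table1_status2,
  DROP INDEX table1_status3,
  DROP INDEX table1_status4,
  DROP INDEX table1_status5,
  DROP INDEX table1_status6;
```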

    Summary:

    A combined index is not always better than an index merge. To know which is better, EXPLAIN your query, analyze the result, and adjust your indexes.

    Nginx, PHP-FPM, 404 error after upgrading to Ubuntu 11.04

    After upgrading from Ubuntu 10.10 to Ubuntu 11.04, my local site didn't work anymore. It responded with a 404 page.

    After researching for a while, I found that the root directive should be put outside the location block. Before:

    server {
      server_name  localhost;
    
      location / {
            root   /var/www;
            index  index.php index.html index.htm;
        }
     ....
    }
    
    

    After:

    server {
    
      root   /var/www;
      index  index.php index.html index.htm;
      server_name  localhost;
    
     ....
    }
    

    Now it works, but I have no idea what changed with the Ubuntu upgrade.
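    One likely explanation (an assumption based on the common PHP-FPM setup, not something I verified against the upgraded packages): root set inside "location /" is not inherited by a separate PHP location, so $document_root there falls back to the server-wide default and SCRIPT_FILENAME points at a path that does not exist, producing a 404. With root at the server level, every location sees the same document root. A sketch of the relevant fragment (the socket path is hypothetical):

```nginx
server {
    root  /var/www;          # visible to every location below
    index index.php index.html index.htm;
    server_name localhost;

    location ~ \.php$ {
        include fastcgi_params;
        # $document_root is /var/www here only because root is set
        # at the server level, not inside "location /"
        fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_pass unix:/var/run/php-fpm.sock;  # hypothetical socket
    }
}
```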

    Solve Wireless Issue When Upgrading From 10.10 to 11.04

    After upgrading from Ubuntu 10.10 to Ubuntu 11.04, the wireless stopped working.

    As a first step, I used the Ubuntu troubleshooting guide to diagnose the issue.

    1. Check whether there are software or hardware blocks on your RF devices. To get a list of devices and their hardware and software status, try "sudo rfkill list".
    ~$ sudo rfkill list
    0: hp-wifi: Wireless LAN
        Soft blocked: yes
        Hard blocked: yes
    

    Unblock your devices with "sudo rfkill unblock 0".

    2. Check for device recognition on my HP laptop (amd64):
    ~$ sudo lshw -C network
      *-network UNCLAIMED     
           description: Network controller
           product: BCM4311 802.11a/b/g
           vendor: Broadcom Corporation
           physical id: 0
           bus info: pci@0000:30:00.0
           version: 01
           width: 32 bits
           clock: 33MHz
           capabilities: pm msi pciexpress bus_master cap_list
           configuration: latency=0
           resources: memory:c8000000-c8003fff
      *-network
           description: Ethernet interface
           product: NetXtreme BCM5788 Gigabit Ethernet
           vendor: Broadcom Corporation
           physical id: 1
           bus info: pci@0000:02:01.0
           logical name: eth0
           version: 03
           serial: 00:17:08:37:6a:64
           size: 100Mbit/s
           capacity: 1Gbit/s
           width: 32 bits
           clock: 66MHz
           capabilities: pm vpd msi bus_master cap_list ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
           configuration: autonegotiation=on broadcast=yes driver=tg3 driverversion=3.116 duplex=full firmware=5788-v3.26 ip=192.168.1.103 latency=64 link=yes mingnt=64 multicast=yes port=twisted pair speed=100Mbit/s
           resources: irq:23 memory:d4000000-d400ffff
    

    Here I can see the 'tg3' driver was in use for the Ethernet interface (note that the wireless controller shows UNCLAIMED). And checking the loaded modules, tg3 is there.

    ~$ sudo lsmod
    

    If your driver was not loaded, you can load it with "sudo modprobe <driver>".

    3. Identify PCI devices and their PCI IDs:
    ~$ sudo lspci -v | grep Ethernet
    02:01.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5788 Gigabit Ethernet (rev 03)
    
    ~$ sudo lspci -v | grep Network
    30:00.0 Network controller: Broadcom Corporation BCM4311 802.11a/b/g (rev 01)
    
    4. Checking the Additional Drivers tool, I found the "Broadcom STA wireless driver" was in use.

    Everything looked good so far: the driver was installed and the device recognized. So I guessed the issue might be compatibility with the new kernel.

    5. With help from Google:

    i) Someone solved it by installing linux-backports-modules-wireless-maverick-generic. It is interesting to learn what backports are.

    Unfortunately it didn't work for me.

    ii) So I tried the solution in this thread: disabling my "Broadcom STA wireless driver" and installing b43.

    Remove "bcmwl-kernel-source" using Synaptic Package Manager.
    Then install "firmware-b43-installer" and "b43-fwcutter" again with Synaptic Package Manager.
    

    After these steps and rebooting the laptop, the wifi issue was fixed. If you still have the issue, I suggest checking the troubleshooting tutorial again. It is really helpful.

    Reference:

    1. http://linuxfans.keryxproject.org/?page_id=27

    2. http://www.ubuntumini.com/2009/11/broadcom-wireless-driver-fix-in-karmic.html