Mysql split column string into rows

A MySQL recipe, that you can use to split a cell value by a known separator into different rows, in some way similar to the PHP explode function or split in PERL.

To turn this:

id value
1 4,5,7
2 4,5
3 4,5,6
…. ….

Into this

id value
1 4
1 5
1 7
2 4
2 5
3 4
3 5
3 6
…. ….

You can simply write and call a stored procedure

DELIMITER $$

DROP PROCEDURE IF EXISTS explode_table $$
CREATE PROCEDURE explode_table(bound VARCHAR(255))

  BEGIN

    DECLARE id INT DEFAULT 0;
    DECLARE value TEXT;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_value INT;
    DECLARE done INT DEFAULT 0;
    DECLARE cur1 CURSOR FOR SELECT table1.id, table1.value
                                         FROM table1
                                         WHERE table1.value != '';
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    DROP TEMPORARY TABLE IF EXISTS table2;
    CREATE TEMPORARY TABLE table2(
    `id` INT NOT NULL,
    `value` VARCHAR(255) NOT NULL
    ) ENGINE=Memory;

    OPEN cur1;
      read_loop: LOOP
        FETCH cur1 INTO id, value;
        IF done THEN
          LEAVE read_loop;
        END IF;

        SET occurance = (SELECT LENGTH(value)
                                 - LENGTH(REPLACE(value, bound, ''))
                                 +1);
        SET i=1;
        WHILE i <= occurance DO
          SET splitted_value =
          (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(value, bound, i),
          LENGTH(SUBSTRING_INDEX(value, bound, i - 1)) + 1), ',', ''));

          INSERT INTO table2 VALUES (id, splitted_value);
          SET i = i + 1;

        END WHILE;
      END LOOP;

      SELECT * FROM table2;
    CLOSE cur1;
  END; $$

Then you simply call it

CALL explode_table(',');
There it is the bare bones. From here it’s simple to adapt and build to your own needs, like adding some kind of filter parameter, order, etc… if your main interface to Mysql is PHPMyAdmin (as of now) forget it, its rubish with this procedures queries, you can use own MySQL GUI – MySQL Workbench – to interface with, or rely on the old CLI ‘mysql’ command, just put the stored procedure definition in a file and load it with a redirect:

mysql -u username -p -D databasename < procedure_definition_file.txt

Also remember:

  • if backups are made with mysqldump, use the –routines switch so the stored procedure definition goes in the dumps.
  • works mysql >= 5.0 only
  • performance, normalization and concurrency – this is not the correct way to do a many to many relationship with a RDBS, you should use a relationship table, and joins to work with it.
  • OK, so your project manager/marketing/boss changed the game rules at the very last moment, and to implement it correctly you must rework a lot of code, i understand :) but even then enter this road at your own peril.
This entry was posted in Geek Stuff and tagged , , , , , . Bookmark the permalink.

17 Responses to Mysql split column string into rows

  1. It`s a nice funktion, but how to store values in additional fields with one id.
    Like this:

    id value1 value2 value3
    1 4 5 7
    2 4 5
    3 4 5 6

  2. Marco says:

    @Tobias

    if you have this kind of table layout, you know in advance the number of field values, so i think you just need a SQL union query.

    This is the SQL definition of your example:

    CREATE TABLE table1 (
    id INT NOT NULL ,
    value1 INT NULL ,
    value2 INT NULL ,
    value3 INT NULL ,
    PRIMARY KEY ( `id` )
    );

    INSERT INTO table1 VALUES ('1', '4', '5', '7');
    INSERT INTO table1 VALUES ('2', '4', '5', null);
    INSERT INTO table1 VALUES ('3', '4', '5', '6');

    And the union query (i filtered out all the null values with a sub-query) as:

    SELECT * FROM (
    SELECT id, value1 AS value FROM table1
    UNION
    SELECT id, value2 AS value FROM table1
    UNION
    SELECT id, value3 AS value FROM table1
    ORDER BY id) AS t1
    WHERE value IS NOT NULL

    • Can you explain to me, what `bound` does?

      • Marco says:

        @Tobias

        “bound” is abbreviation for boundary (as in value boundary), it’s the char(s) that you want to explode by. In my example i have the values split by commas, as in “1,2,3″ but i could have another char (ex: a pipe) “1|2|3″ and the function works fine, just call it with the correct delimiter:

        CALL explode_table(‘,’); // for comma separated values
        CALL explode_table(‘|’); // for pipe separated values
        etc…

        ps – just didn’t use”delimiter” because it’s a SQL word.

  3. For a better understanding, please take a look at this:

    I try to split the values from `datafield`:

    VALUES:
    88044,A,56

    Therefore i wrote this:

    BEGIN

    DECLARE id INT DEFAULT 0;
    DECLARE value varchar(64);
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_value varchar(64);
    DECLARE done INT DEFAULT 0;
    DECLARE cur1 CURSOR FOR SELECT interface.id, interface.datafield
    FROM interface
    WHERE interface.datafield != ”;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    DROP TABLE if exists OrderTask;

    OPEN cur1;
    read_loop: LOOP
    FETCH cur1 INTO id, value;
    IF done THEN
    LEAVE read_loop;
    END IF;

    SET occurance = (SELECT LENGTH(value)
    – LENGTH(REPLACE(value, bound, ”))
    +1);
    SET i=1;
    WHILE i <= occurance DO
    SET splitted_value =
    (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(value, bound, i),
    LENGTH(SUBSTRING_INDEX(value, bound, i – 1)) + 1 ) ,',', ''));

    INSERT INTO OrderSubstr(interface_id, order_id) VALUES (id, splitted_value);
    SET i = i + 1;

    END WHILE;
    END LOOP;

    Create Table OrderTask AS
    (SELECT OrderSubstr.interface_id, OrderSubstr.order_id, OrderSubstr.order_status, OrderSubstr.order_total
    FROM OrderSubstr LEFT JOIN interface ON interface.id = OrderSubstr.interface_id
    WHERE interface.status = '-' AND interface.acceptor = 'erp' AND interface.actiontype = 2);
    DELETE FROM OrderSubstr WHERE id is not NULL;

    CLOSE cur1;
    END;

    The Result is:

    interface_id | order_id | order_status | order_total 20533949 | 88044 | NULL | NULL
    20533949 | A | NULL | NULL
    20533949 | 56 | NULL | NULL

    My problem is to split data into one row. The result should be:

    interface_id | order_id | order_status | order_total 20533949 | 88044 | A | 56

    Splitting in one row by using the loop, thats it!

  4. Bala says:

    Thanks for your good explanation and example. Its saves my tones n tones of time. I have more columns instead two columns. Lets take 5 columns with multiple values in each Row. I wanna a output like a combination. Is it possible??? Can u guide me???

    • Marco says:

      @Bala

      just use CONCAT or CONCAT_WS to work out the n columns into 2, ex:
      SELECT id, CONCAT(col_1, col_2, col_3, col_4) AS values FROM table 1

  5. Marcus says:

    Thanks for sharing this. Saved me quite some time.

    I had some problems with UTF-8 / latin1 charater mixery issues in my use case. The LENGTH() – Command was calculating a wrong length if there was an ‘ü’ or ß or some other special character in the string.

    So I changed it to CHAR_LENGTH() which gives you back the real length of a string even with special characters and encoding issues inside ist.

    If someone runs into problems with this script that words get chopped of than give this CHAR_LENGTH() a try.

  6. I solved the problem with the help of your ideas. Thanks a lot. I use a case statement while doing the loop. Now the script takes every value and splits it by reading pipes into several fields.

    DELIMITER $$

    DROP PROCEDURE IF EXISTS customers $$
    CREATE PROCEDURE customers(bound VARCHAR(255))

    BEGIN

    DECLARE id INT DEFAULT 0;
    DECLARE value TEXT;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_value varchar(64);
    DECLARE done INT DEFAULT 0;
    DECLARE cur1 CURSOR FOR SELECT ERP.id, ERP.datafield
    FROM ERP
    WHERE ERP.datafield != ” AND actiontype = 1;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
    — DELETE FROM CustomerTask WHERE interface_id is not NULL;

    OPEN cur1;
    read_loop: LOOP
    FETCH cur1 INTO id, value;
    IF done THEN
    LEAVE read_loop;
    END IF;

    SET occurance = (SELECT LENGTH(value)
    – LENGTH(REPLACE(value, bound, ”))
    +1);
    SET i=1;
    WHILE i <= occurance DO
    SET splitted_value =
    (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(value, bound, i),
    LENGTH(SUBSTRING_INDEX(value, bound, i – 1)) + 1), '|', ''));

    CASE i
    WHEN 1 THEN
    INSERT INTO SplitCustomers(interface_id, user_info_id) VALUES (id, splitted_value);
    WHEN 2 THEN
    UPDATE SplitCustomers SET user_id = splitted_value WHERE interface_id = id;
    WHEN 3 THEN
    UPDATE SplitCustomers SET address_type = splitted_value WHERE interface_id = id;
    WHEN 4 THEN
    UPDATE SplitCustomers SET company = splitted_value WHERE interface_id = id;
    WHEN 5 THEN
    UPDATE SplitCustomers SET title = splitted_value WHERE interface_id = id;
    WHEN 6 THEN
    UPDATE SplitCustomers SET last_name = splitted_value WHERE interface_id = id;
    WHEN 7 THEN
    UPDATE SplitCustomers SET first_name = splitted_value WHERE interface_id = id;
    WHEN 8 THEN
    UPDATE SplitCustomers SET phone_1 = splitted_value WHERE interface_id = id;
    WHEN 9 THEN
    UPDATE SplitCustomers SET phone_2 = splitted_value WHERE interface_id = id;
    WHEN 10 THEN
    UPDATE SplitCustomers SET fax = splitted_value WHERE interface_id = id;
    WHEN 11 THEN
    UPDATE SplitCustomers SET address_1 = splitted_value WHERE interface_id = id;
    WHEN 12 THEN
    UPDATE SplitCustomers SET address_2 = splitted_value WHERE interface_id = id;
    WHEN 13 THEN
    UPDATE SplitCustomers SET city = splitted_value WHERE interface_id = id;
    WHEN 14 THEN
    UPDATE SplitCustomers SET state = splitted_value WHERE interface_id = id;
    WHEN 15 THEN
    UPDATE SplitCustomers SET country = splitted_value WHERE interface_id = id;
    WHEN 16 THEN
    UPDATE SplitCustomers SET zip = splitted_value WHERE interface_id = id;
    WHEN 17 THEN
    UPDATE SplitCustomers SET user_email = splitted_value WHERE interface_id = id;
    WHEN 18 THEN
    UPDATE SplitCustomers SET extra_field_1 = splitted_value WHERE interface_id = id;
    WHEN 19 THEN
    UPDATE SplitCustomers SET extra_field_2 = splitted_value WHERE interface_id = id;
    WHEN 20 THEN
    UPDATE SplitCustomers SET extra_field_3 = splitted_value WHERE interface_id = id;
    END CASE;

    SET i = i + 1;

    CLOSE cur1;
    END;

  7. The next question is, how to do it the other way? Taking several fields and write it
    into one Field. Any ideas?

  8. Pingback: How to transpose a single table row to multiple rows in MySQL « Remember's Blog

  9. One that I’m pondering on at the moment is a variance on the other way around.

    I’m updating a system that I have inherited and it has type fields to describe properties eg: type1=”with pool”, type2=”near beach”, type3=”close to shops” etc…

    I have normalized this into table types and property_type join table:
    id int(11) ~ auto-increment
    property_id int(11) ~ FK to property table
    type_id int(11) ~ FK to type table

    I thought I would be able to use a simple CASE statement thus:
    SELECT Q.prop_id,
    MAX(IF(Q.type = 20,’Y’,’N’)) AS ‘with pool’,
    MAX(IF(Q.type = 21,’Y’,’N’)) AS ‘near beach’,
    MAX(IF(Q.type = 22,’Y’,’N’)) AS ‘close to shops’,
    FROM test Q
    GROUP BY Q.prop_id

    BUT there are more types in the database than there are type columns.

    I want a pure SQL solution as I don’t want to change the website front-end …
    Perhaps, it’s time to wrtie a stored procedure …

    …any ideas ???

  10. Gildus says:

    In the example od tutorial you have to change in the DECLARE:

    DECLARE splitted_value INT;

    by

    DECLARE splitted_value TEXT;

    Regards.

    • Marco says:

      Good point there.

      Here in the example we are working with integers thus the DECLARE splitted_value INT;
      But of course if you want to split text, you should declare it as TEXT.

  11. ewil says:

    My goal was something next of your example, so i have done some changes to fit with my needs, so maybe it can also help someone… It just expects as paramter the string, and a separator character… then it returns the recordset, resuming, it split column into rows…

    Thanks,
    ewil

    THE CODE:

    DELIMITER $$

    DROP PROCEDURE IF EXISTS explodeintotable $$
    CREATE PROCEDURE explodeintotable(str VARCHAR(255), sep VARCHAR(1))
    BEGIN
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_value VARCHAR(100);

    DROP TEMPORARY TABLE IF EXISTS table2;
    CREATE TEMPORARY TABLE table2(
    `vl` VARCHAR(255) NOT NULL
    ) ENGINE=Memory;

    /* QUANTAS OCORRENCIAS TEM */
    SET occurance = (SELECT LENGTH(str)
    – LENGTH(REPLACE(str, sep, ”))
    +1);
    SET i=1;
    WHILE i <= occurance DO
    SET splitted_value =
    (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(str, sep, i),
    LENGTH(SUBSTRING_INDEX(str, sep, i – 1)) + 1), sep, ''));

    INSERT INTO table2 VALUES (splitted_value);
    SET i = i + 1;
    END WHILE;

    SELECT * FROM table2;
    END; $$

  12. Very helpful. Thank you

Leave a Reply