URL/File decoupling with Apache mod_rewrite and PHP

In the beginning God created the heavens and the earth, then there was the Apache HTTP Server and afterwards PHP. Back in those ages, I was split between CGIs and mod_php. So the usual URL was something like http://www.mydomain.com/cgi-bin/script.pl?do=this or http://www.mydomain.com/script.php?do=that. Each URL was simply a direct link to a file in the filesystem (script.pl or script.php) … and it was also ugly as hell…

I was not happy with this at all… and besides, just to make my days more miserable, some websites had perfect URLs, like:

http://www.mydomain.com/product/my_product
http://www.mydomain.com/product/my_other_product

So… I set to work to emulate this. First, I tried to use a directory structure and take advantage of the directory index feature (DirectoryIndex), where the index file is automatically served by the server. Ex:

http://www.mydomain.com/product/my_product/index.html
http://www.mydomain.com/product/my_other_product/index.html

but you could actually access them the way I wanted:

http://www.mydomain.com/product/my_product
http://www.mydomain.com/product/my_other_product

This was of course a nightmare to maintain, not to mention database-driven websites and their related problems, template problems… nightmare…. So I moved on to the PHP auto-prepend feature (auto_prepend_file). The idea was to catch the user request with a PHP file (that is prepended to each request), do all the parsing and display, and then kill the normal page processing. Better, but not quite there yet….

Then I discovered Apache mod_rewrite, and everything made sense: all things, the universe, the meaning of life, even Flash programming (well, maybe not Flash programming). With some simple rules I was able to catch the user requests, filter the ones I wanted to a central file (that I call handler.php), parse the request and send it to whatever file/module I want.

RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(php|xml)$ [NC]
RewriteRule \.[a-z0-9]{1,}$ - [NC,L]
RewriteRule .* %{DOCUMENT_ROOT}/handler.php [L]

What we are doing here is quite simple but at the same time powerful. With these rules, all requests with a file extension (.gif, .png, .js, .css, etc.), usually static content, are served directly as normal (line 3), except for requests with a .php or .xml extension, which are sent to “handler.php” because the condition on line 2 excludes them from that pass-through. Everything else, including URLs with no extension at all, falls through to the last rule (line 4) and also ends up in “handler.php”. If we want other extensions to be sent to dynamic parsing (ex: a server-side generated image), just add them on line 2.
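
To make the rule order concrete, here is a minimal sketch in Python (not Apache) that mimics the decision those rules take for a request path. The function name goes_to_handler and the sample paths are made up for illustration; the two regular expressions are copied from the rules above, and the sketch ignores everything else mod_rewrite does.

```python
import re

# Same patterns as in the rewrite rules, case-insensitive like the [NC] flag.
DYNAMIC_EXT = re.compile(r"\.(php|xml)$", re.IGNORECASE)
ANY_EXT = re.compile(r"\.[a-z0-9]{1,}$", re.IGNORECASE)


def goes_to_handler(path):
    """Return True if the request would end up in handler.php."""
    # Lines 2-3: a path with an extension other than .php/.xml is left alone.
    if not DYNAMIC_EXT.search(path) and ANY_EXT.search(path):
        return False
    # Line 4: everything that got this far is rewritten to handler.php.
    return True


for sample in ("/gfx/logo.gif", "/product/my_product", "/feed.xml"):
    print(sample, goes_to_handler(sample))
```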

Then a stripped-down handler.php file is something like this:

// configurations
require('config/vars.inc.php');
require('config/bd.inc.php');

// session start
session_name(SESSION_NAME);
session_start();

// output buffering
ob_start();
  
// get script parts
$uri = $_SERVER['REQUEST_URI'];
$tmp = explode ("?", $uri);
if (! isset($tmp[0])) $tmp[0] = '/';
$script_parts = explode ("/", $tmp[0]);

// clean empty keys
$tmp = array();
foreach($script_parts as $key=>$row)
  if ($row != '') $tmp[] = $row;
$script_parts = $tmp;

// default
if (! isset($script_parts[0])) 
  $script_parts[0] = 'hp';

// Send to execution
switch ($script_parts[0]) {
  case 'hp':
    require($_SERVER['DOCUMENT_ROOT'].'/homepage.php');
    break;
		
  case 'products':
    if (isset($script_parts[1])){
      require($_SERVER['DOCUMENT_ROOT'].'/modules/products/detail.php');
      break;
    }	
		
    require($_SERVER['DOCUMENT_ROOT'].'/modules/products/cat.php');
    break;
		
  case 'php':
    phpinfo();
    break;
		
  default:  // 404 error (not found)
    header("HTTP/1.0 404 Not Found");
    require($_SERVER['DOCUMENT_ROOT'].'/templates/error404.php');	
}

Simple: include all the global stuff, start sessions, database links, etc… get the URL request, parse it and send it to whatever file does the processing. But as always you can/should build from here, change it to your needs and/or style, add your secret ingredient, do it better for yourself.
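
The parsing and dispatch step can be sketched on its own, outside PHP. This is a rough Python illustration, not the handler itself: script_parts and resolve are hypothetical names, the returned paths are the ones used in the handler above, and the phpinfo() case is left out.

```python
def script_parts(request_uri):
    """Drop the query string and split the path into non-empty segments."""
    path = request_uri.split("?", 1)[0]
    return [part for part in path.split("/") if part != ""]


def resolve(request_uri):
    """Return (status, file) for a request, mirroring the switch above."""
    parts = script_parts(request_uri) or ["hp"]  # default to the homepage
    if parts[0] == "hp":
        return 200, "/homepage.php"
    if parts[0] == "products":
        if len(parts) > 1:
            return 200, "/modules/products/detail.php"
        return 200, "/modules/products/cat.php"
    return 404, "/templates/error404.php"


print(script_parts("/products/my_product?do=this"))
print(resolve("/products/my_product?do=this"))
```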

Some (many) years ago I would have kicked some ass to read this post and get this info on a silver platter.

Control big mean devices with an Arduino

Since Codebits, I have had the Arduino in the bag… also in the bag (due to time constraints) was the desire to put it to work, to do something with it, anything. Today they both jumped out of the bag, so the goal is to make a remote mains switch with a big red button (the end-of-the-world type). Sure, I can walk to the switch, but this way is much more fun :)…

The first step is to control (switch on/off) a mains-powered device with an Arduino, which operates at low DC voltage. For that I needed a Solid State Relay (other routes are possible here), and curiously enough that was exactly what I had in the mailbox today, from China. These are really cheap on eBay, just take the time to read the specs and match them to your needs. Remember, Volts x Amps = Watts

Ex:
a 100 W lamp: 220 V * x amps = 100 W, so x = 100 / 220 = 0.45 amps
a 3000 W heater: 220 V * x amps = 3000 W, so x = 3000 / 220 = 13.6 amps

So for the heater the SSR should have an Output Current of at least 15 amps (or a bit higher to play it safe), and the 220 V should be within its Output Voltage range. The Arduino output voltage is 5 V, so also check that the SSR Input Voltage range supports 5 V (if it is not in range, you will not be able to control the SSR with the Arduino alone).
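
The same arithmetic as a small Python sketch, in case you want to check a few loads at once. The function names and the optional margin parameter are my own assumptions for illustration; the formula is just Watts / Volts = Amps.

```python
def required_current(watts, volts):
    """Current drawn by a load, from Volts x Amps = Watts."""
    return watts / volts


def ssr_is_adequate(watts, volts, rated_amps, margin=1.0):
    """True if the SSR output current rating covers the load.

    margin is a hypothetical safety factor (1.0 means no extra headroom).
    """
    return rated_amps >= required_current(watts, volts) * margin


print(round(required_current(100, 220), 2))   # the lamp
print(round(required_current(3000, 220), 1))  # the heater
print(ssr_is_adequate(3000, 220, rated_amps=15))
```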

I used an old appliance cable. Just strip the outer sheath and there should be 3 wires inside: the green/yellow one is the ground, don’t touch this one. Cut one of the other wires (normally blue or gray), strip each cut end and connect the AC side of the SSR to them. Now connect the Arduino to the DC side of the SSR, one digital pin to positive and ground to negative. You can put the tools away now.

You can now connect the Arduino to the computer, install the drivers if needed and download the Arduino Environment. Just follow the instructions on the Arduino website and you should be up and running in minutes. The code is as simple as it gets: it reads from serial, and if it receives 1 the pin goes high (with voltage) and if it receives 0 the pin goes low (no voltage).

int pinNumber = 13;
int incomingByte;

void setup() {
    Serial.begin(9600);
    pinMode(pinNumber, OUTPUT);
}

void loop() {
    if (Serial.available() > 0) {
        incomingByte = Serial.read();
        Serial.println(incomingByte);
    }

    // '0' (ASCII 48) switches the pin off, '1' (ASCII 49) switches it on
    if (incomingByte == '0') {
        digitalWrite(pinNumber, LOW);
    } else if (incomingByte == '1')  {
        digitalWrite(pinNumber, HIGH);
    }
}

I chose pin 13 (Arduino to SSR positive) because there is a built-in LED on it that helps with debugging. You can test now from the Arduino Environment: simply open the Serial Monitor (under Tools), send 1 and 0, and the LED should light up and go off. You can also connect using PuTTY, a fine SSH client with a serial interface, or even the venerable Windows HyperTerminal. Just remember this is serial, so one client at a time :).

When I connected/disconnected from any client there was a rapid light flicker. At first I thought it was something related to the communication, sending zeros and ones in the handshake or something. But I was wrong (as usual): this is a feature of the Arduino to simplify and automate uploading new programs. On every serial connection it resets itself, hence the flicker, and waits a couple of seconds for a new program upload (a sketch, in Arduino lingo); if nothing is being uploaded, it proceeds to normal program execution from the start (with state loss). You can disable the auto-reset feature if you need to.

Of course you want to control it programmatically, so my first try was with PHP. There is some code floating around the internets, something like:

$port = fopen('COM1', 'w'); // COM number where the Arduino is
fwrite($port, '1');
fclose($port);

and to turn the light off, something like:

$port = fopen('COM1', 'w'); // COM number where the Arduino is
fwrite($port, '0');
fclose($port);

BUT THIS CODE OBVIOUSLY WON’T WORK ON CURRENT STANDARD ARDUINOS, due to the auto-reset feature that I mentioned above. It will open the COM port (reset), then send the byte to the port too fast, while the device is still waiting for a sketch upload (which is the reason for the auto-reset in the first place), then it will close the connection (reset again). So you will only end up with some fast light flicker….

This code will work:

if ($port = fopen('COM1', 'w')) { // open arduino port
    sleep(2);                     // wait for end of auto-reset 

    $i = 1;
    while (true) {                // loop to keep port open
        if ($i % 2)               // if i is odd lights on
            fwrite($port, '1');
        else
            fwrite($port, '0');   // else lights off

        sleep(2);                 // waits 2 secs between each cycle
        $i++;
    }
    fclose($port);
} else {
    print("Check port number or previous active session");
    die();
}

From here you could work it out: make a daemon that connects to the Arduino via serial, listens on some socket and routes/proxies the input from socket to serial. It’s doable, and not that hard, but anyway PHP is not the correct tool for the job, at all… so I used Serproxy, a proxy program for redirecting network socket connections to/from serial links. It is very easy to use and does just what I wanted.

From there you simply connect and send the on/off (1/0) command to the socket, and you don’t have to worry about auto-resets. When you close the socket connection the serial link stays up. So making a web interface from here was simple, and that is what I did, just for the fun of it (PHP works well here).
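
Sending the command to the proxy socket takes only a few lines in any language. Here is a minimal Python sketch; the function name, host and port are placeholders (they depend on how your proxy is configured), and the only thing taken from this post is the protocol: a single '1' or '0' character.

```python
import socket


def send_switch_command(host, port, on, timeout=5.0):
    """Open a TCP connection to the serial proxy and send '1' or '0'."""
    payload = b"1" if on else b"0"  # ASCII 49 / 48, what the sketch expects
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
    return payload


# Hypothetical usage, assuming the proxy listens on localhost port 5331:
# send_switch_command("127.0.0.1", 5331, on=True)
```

Because the proxy keeps the serial link open, each short-lived socket connection like this one does not trigger the Arduino auto-reset.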

Here it is working:

You can light my desk lamp now 🙂 at http://light.waynext.com/. Sorry, no upload bandwidth to put up a webcam stream (Porto Salvo = Juda’s ass), so you have to trust my word, call Waynext or check out the video.

Your RGI (Reality Gateway Interface) is now complete. The next step is to put it to work remotely, with the Arduino disconnected from the computer, so I must dig into XBee shields. I also want to dig into a direct interface with Perl, Python or Java.

PHP regexp replace word(s) in html string if not inside tags

The problem was to find and replace text inside HTML (without breaking the HTML). Take for example this string:

<img title="My image" alt="My image" src="/gfx/this is my image.gif"><p>This is my string</p>

and say you want to replace the string “my” with another string, or enclose it inside another tag (let’s assume <strong></strong>), but only the “my” outside the HTML tags. So after the transformation it would look like:

<img title="My image" alt="My image" src="/gfx/this is my image.gif"><p>This is <strong>my</strong> string</p>

With the PHP regular expression functions, the typical solution (find and replace with word boundaries) fails here.

preg_replace('/\b(my)\b/i',
             '<strong>$1</strong>',
             $html_string);

You will end up with messed-up HTML:

<img title="<strong>My</strong> image" alt="<strong>My</strong> image" src="/gfx/this is <strong>my</strong> image.gif"><p>This is <strong>my</strong> string</p>

Now think of the wonderful mess you would get if you were replacing words like “form” or “alt”, which can be a text node, an HTML tag or an attribute….

So how to fix this? I figured that the only thing common to all tags is the open and close characters, < and >. From there you simply search for the word you want to replace plus everything up to the next close-tag character (the > sign), and within the matched result you look for an open-tag character. If you don’t find one, you are inside a tag, so you abort the replace. Here is the code:

function checkOpenTag($matches) {
    if (strpos($matches[0], '<') === false) {
        return $matches[0];
    } else {
        return '<strong>'.$matches[1].'</strong>'.$matches[2];
    }
}

preg_replace_callback('/(\bmy\b)(.*?>)/i',
                      'checkOpenTag',
                      $html_string);

If you are going to use this kind of code to search for several words in an HTML text (ex: a glossary implementation), test for performance and do think about a caching system.

That’s it. Remember, just as this solution worked fine for me, it can also work terribly for you, so proceed at your own risk (aka liability disclaimer).

UPDATE 19-04-14
There was a comment on this post warning that only the first occurrence is replaced in an HTML segment. So here is an updated version of the PHP example with this issue corrected:

<?php

class replaceIfNotInsideTags {

  private $word;

  private function checkOpenTag($matches) {
    // no '<' between the word and the next '>' means we are inside a tag
    if (strpos($matches[0], '<') === false) {
      return $matches[0];
    } else {
      // wrap the word, then look for more occurrences in the rest of the match
      return '<strong>'.$matches[1].'</strong>'.$this->doReplace($matches[2]);
    }
  }

  private function doReplace($html) {
    return preg_replace_callback('/(\b'.preg_quote($this->word, '/').'\b)(.*?>)/i',
                                 array($this, 'checkOpenTag'),
                                 $html);
  }

  public function replace($html, $word) {
    $this->word = $word;

    return $this->doReplace($html);
  }
}

$html = '<p>my bird is my life is my dream</p>';

$obj = new replaceIfNotInsideTags();
echo $obj->replace($html, 'my');

?>
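
For readers outside PHP, the same idea can be sketched in Python. This is a rough port under my own naming (emphasize_outside_tags is not from the post), following the updated version above: match the word plus everything up to the next ">", skip the match when it contains no "<", otherwise wrap the word and recurse into the rest of the match.

```python
import re


def emphasize_outside_tags(html, word):
    """Wrap `word` in <strong> only where it appears outside HTML tags."""
    pattern = re.compile(r"(\b" + re.escape(word) + r"\b)(.*?>)", re.IGNORECASE)

    def callback(match):
        # No '<' before the next '>' means the word sits inside a tag.
        if "<" not in match.group(0):
            return match.group(0)
        # Wrap the word, then handle further occurrences in the remainder.
        rest = pattern.sub(callback, match.group(2))
        return "<strong>" + match.group(1) + "</strong>" + rest

    return pattern.sub(callback, html)


print(emphasize_outside_tags("<p>my bird is my life is my dream</p>", "my"))
```

Like the PHP original, this only finds words that are followed by a ">" somewhere later on the same line, so plain text with no tags after it is left untouched.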

Mysql split column string into rows

A MySQL recipe that you can use to split a cell value by a known separator into different rows, somewhat similar to the PHP explode function or split in Perl.

To turn this:

id  value
1   4,5,7
2   4,5
3   4,5,6
…   …

Into this:

id  value
1   4
1   5
1   7
2   4
2   5
3   4
3   5
3   6
…   …

You can simply write and call a stored procedure:

DELIMITER $$

DROP PROCEDURE IF EXISTS explode_table $$
CREATE PROCEDURE explode_table(bound VARCHAR(255))

  BEGIN

    DECLARE id INT DEFAULT 0;
    DECLARE value TEXT;
    DECLARE occurance INT DEFAULT 0;
    DECLARE i INT DEFAULT 0;
    DECLARE splitted_value VARCHAR(255);
    DECLARE done INT DEFAULT 0;
    DECLARE cur1 CURSOR FOR SELECT table1.id, table1.value
                                         FROM table1
                                         WHERE table1.value != '';
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    DROP TEMPORARY TABLE IF EXISTS table2;
    CREATE TEMPORARY TABLE table2(
    `id` INT NOT NULL,
    `value` VARCHAR(255) NOT NULL
    ) ENGINE=Memory;

    OPEN cur1;
      read_loop: LOOP
        FETCH cur1 INTO id, value;
        IF done THEN
          LEAVE read_loop;
        END IF;

        SET occurance = (SELECT LENGTH(value)
                                 - LENGTH(REPLACE(value, bound, ''))
                                 +1);
        SET i=1;
        WHILE i <= occurance DO
          SET splitted_value =
          (SELECT REPLACE(SUBSTRING(SUBSTRING_INDEX(value, bound, i),
          LENGTH(SUBSTRING_INDEX(value, bound, i - 1)) + 1), bound, ''));

          INSERT INTO table2 VALUES (id, splitted_value);
          SET i = i + 1;

        END WHILE;
      END LOOP;

      SELECT * FROM table2;
    CLOSE cur1;
  END; $$

Then you simply call it

CALL explode_table(',');

There it is, the bare bones. From here it’s simple to adapt and build on for your own needs, like adding some kind of filter parameter, ordering, etc… If your main interface to MySQL is phpMyAdmin (as of now), forget it, it’s rubbish with these procedure queries. You can use MySQL’s own GUI, MySQL Workbench, or rely on the old CLI ‘mysql’ command: just put the stored procedure definition in a file and load it with a redirect:

mysql -u username -p -D databasename < procedure_definition_file.txt
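
If it helps to see the logic without the cursor plumbing, here is a small Python sketch of what the procedure computes. The function names are made up for illustration; count_pieces mirrors the LENGTH/REPLACE trick used to count the values (valid for a single-character separator), and explode_rows mirrors the loop that fills table2.

```python
def count_pieces(value, separator):
    """Number of pieces: removed separator characters plus one."""
    return len(value) - len(value.replace(separator, "")) + 1


def explode_rows(rows, separator):
    """Turn (id, 'a,b,c') rows into one (id, piece) row per piece."""
    exploded = []
    for row_id, value in rows:
        if value == "":
            continue  # the cursor query skips empty values as well
        for piece in value.split(separator):
            exploded.append((row_id, piece))
    return exploded


print(explode_rows([(1, "4,5,7"), (2, "4,5")], ","))
```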

Also remember:

  • if backups are made with mysqldump, use the --routines switch so the stored procedure definition goes into the dumps.
  • works with MySQL >= 5.0 only
  • performance, normalization and concurrency: this is not the correct way to do a many-to-many relationship with an RDBMS, you should use a relationship table and joins to work with it.
  • OK, so your project manager/marketing/boss changed the game rules at the very last moment, and to implement it correctly you must rework a lot of code, I understand 🙂 but even then enter this road at your own peril.

Rework by 37 Signals

Like Gordon Gekko once said, “Because everyone is drinking the same Kool Aid”, and just because everybody in the business (the web business I mean) is drinking “Rework” by 37 Signals, I drank it too…. So what’s my taste of this book?

It’s a book with a complex taste, not because it digs deep down the rabbit hole, but because it (tries to) speak about everything in the business universe. It goes from (unordered list) planning, to meetings, to time management, customer management, task prioritization, hiring and firing, office policies, marketing, product building, product minimalism, workaholism, by-products, productivity, startups, etc, etc….

It’s filled with common sense (which is not so common) and Lapalissades, which makes one feel smart:

«Failure is not a prerequisite of success.» – I knew that

«Forgoing sleep is a bad idea.» – I also knew that

«Other people’s failures are just that: other people’s failures» – Duhh

«Revenue in, expenses out. Turn a profit or wind up gone.» – Heck, even the owner of the tavern where I go for cheap drinks knows this

«If you want to get someone’s attention, it’s silly to do exactly the same thing as everyone else.» – I rest my case

But on the other hand, you are always making reality checks, comparing your own practices with the ones described in the book, and this kind of review is obviously good.

Anyway, the work-smart-not-hard philosophy makes sense, there are some good marketing tips, and I really liked the teach-and-spread-your-trade-secrets approach. It also makes a strong point about minimalistic products, those products that you strip down to the core to make them easier, cheaper and more maintainable. Likewise, they don’t like guys stuffed in suits (about#hate) and useless meetings (about#hate) … ahhhh… that was good for my ego.

The final balance is positive: everyone can get good ideas out of it, but it is not the “fabulous”, “best book in my life and afterlife” hype that you read in Amazon reviews.

Here are some of my favorite quotes:

«When you treat people like children, you get children’s work»

«And when everything is high priority, nothing is»

«Businesses are usually paranoid and secretive. They think they have proprietary this and competitive advantage that. Maybe a rare few do, but most don’t»

«Having the idea for eBay has nothing to do with actually creating eBay»

«The worst interruptions of all are meetings»

«How long someone’s been doing it is overrated. What matters is how well they’ve been doing it»