URL/File decoupling with Apache mod_rewrite and PHP

In the beginning God created the heavens and the earth, then there was the Apache HTTP Server and afterwards PHP. Back in those ages, I was split between CGIs and mod_php. So the usual URL was something like http://www.mydomain.com/cgi-bin/script.pl?do=this or http://www.mydomain.com/script.php?do=that. The URL was simply a direct link to a file in the filesystem (script.pl or script.php)… and it was also ugly as hell…

I was not happy with this at all… and besides, just to make my days more miserable, some websites had perfect URLs, like:

http://www.mydomain.com/product/my_product
http://www.mydomain.com/product/my_other_product

So… I set to work to emulate this. First, I tried a directory per page, taking advantage of the directory index feature (DirectoryIndex), where the index file is served automatically by the server. Ex:

http://www.mydomain.com/product/my_product/index.html
http://www.mydomain.com/product/my_other_product/index.html

but you could actually access them the way I wanted:

http://www.mydomain.com/product/my_product
http://www.mydomain.com/product/my_other_product

This was of course a nightmare to maintain, not to mention database-driven websites and their related problems, template problems… nightmare…. So I moved on to the PHP auto-prepend feature (auto_prepend_file). The idea was to catch the user request with a PHP file that is prepended to every request, do all the parsing and display there, and then kill the normal page processing. Better, but not quite there yet….

Then I discovered Apache mod_rewrite, and everything made sense, all things, the universe, the meaning of life, even Flash programming (well, maybe not Flash programming). With some simple rules I was able to catch the user requests, filter the ones I wanted to a central file (that I call handler.php), parse the request there and send it to whatever file/module I want.

RewriteEngine on
RewriteCond %{REQUEST_URI} !\.(php|xml)$ [NC]
RewriteRule \.[a-z0-9]{1,}$ - [NC,L]
RewriteRule .* %{DOCUMENT_ROOT}/handler.php [L]

What we are doing here is quite simple but at the same time powerful. With these rules, all requests with a file extension (.gif, .png, .js, .css, etc.), usually static content, are served directly as normal (line 3 of the rules), except for requests with a .php or .xml extension, which are excluded by the condition (line 2) and so fall through to handler.php along with everything else (line 4). If you want other extensions to go through dynamic parsing, e.g. server-side generated images, just add them to the condition on line 2.
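To make the decision those rules take easier to follow, here is a minimal PHP sketch that mimics it for a given URL path. The function name goesToHandler is made up for this example, and it is only an illustration: the real filtering is done by Apache, not by PHP.

```php
<?php
// Hypothetical helper, for illustration only: mirrors the two rewrite patterns.
// $path is the URL path without the query string.
function goesToHandler(string $path): bool
{
    // same pattern as the pass-through rule: the path ends in ".something"
    $hasExtension = preg_match('/\.[a-z0-9]{1,}$/i', $path) === 1;

    // same pattern as the condition: .php and .xml count as dynamic
    $isDynamic = preg_match('/\.(php|xml)$/i', $path) === 1;

    // served directly only when there is an extension and it is not php/xml
    $servedDirectly = $hasExtension && !$isDynamic;

    return !$servedDirectly;
}

foreach (array('/product/my_product', '/gfx/logo.gif', '/script.php') as $path) {
    echo $path, ' => ', goesToHandler($path) ? 'handler.php' : 'static file', "\n";
}
```

Something like this can be handy to check a list of URLs before touching the server configuration.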

A stripped-down handler.php file then looks something like this:

// configurations
require('config/vars.inc.php');
require('config/bd.inc.php');

// session start
session_name(SESSION_NAME);
session_start();

// output buffering
ob_start();
  
// get script parts
$uri = $_SERVER['REQUEST_URI'];
$tmp = explode ("?", $uri);
if (! isset($tmp[0])) $tmp[0] = '/';
$script_parts = explode ("/", $tmp[0]);

// clean empty keys
$tmp = array();
foreach($script_parts as $key=>$row)
  if ($row != '') $tmp[] = $row;
$script_parts = $tmp;

// default
if (! isset($script_parts[0])) 
  $script_parts[0] = 'hp';

// Send to execution
switch ($script_parts[0]) {
  case 'hp':
    require($_SERVER['DOCUMENT_ROOT'].'/homepage.php');
    break;
		
  case 'products':
    if (isset($script_parts[1])){
      require($_SERVER['DOCUMENT_ROOT'].'/modules/products/detail.php');
      break;
    }	
		
    require($_SERVER['DOCUMENT_ROOT'].'/modules/products/cat.php');
    break;
		
  case 'php':
    phpinfo();
    break;
		
  default:  // 404 error (not found)
    header("HTTP/1.0 404 Not Found");
    require($_SERVER['DOCUMENT_ROOT'].'/templates/error404.php');	
}

Simple: include all the global stuff, start sessions, database links, etc… get the URL request, parse it and send it to whatever file does the processing. But as always you can/should build from here, change it to your needs and/or style, add your secret ingredient, do it better for yourself.
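As one possible way to build from here, the "get script parts" and "clean empty keys" steps can be folded into a small function. This is just a sketch: scriptParts is a name I made up, and it relies on parse_url() to drop the query string instead of the explode() on "?".

```php
<?php
// Hypothetical helper: turns a request URI into the list of path segments
// that the switch in handler.php works with.
function scriptParts(string $requestUri): array
{
    // keep only the path, dropping "?do=this" and friends
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/';   // malformed or empty URI: treat it as the homepage
    }

    // split on "/" and throw away the empty segments
    $parts = array_values(array_filter(
        explode('/', $path),
        function ($part) { return $part !== ''; }
    ));

    // default to the homepage key used in the switch
    return $parts ? $parts : array('hp');
}

print_r(scriptParts('/products/my_product?do=this'));
```

In handler.php this would be called as scriptParts($_SERVER['REQUEST_URI']) and the result used as $script_parts.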

Some (many) years ago I would have killed to read this post and get this info on a silver platter.

Blue pill, red pill

After this, there is no turning back. You take the blue pill – the story ends, you wake up in your bed and believe whatever you want to believe. You take the red pill – you stay in Wonderland and I show you how deep the rabbit-hole goes.

So, let’s see how deep this goes.

I wanna wake up! Tech support! It’s a nightmare! Tech support! Tech support!

Control big mean devices with an Arduino

Since Codebits, I had the Arduino in the bag… and also in the bag (due to time constraints) was the desire to put it to work, to do something with it, anything. Today they both jumped out of the bag, so the goal is to make a remote mains switch with a big red button (the end of the world type). Sure, I can walk to the switch, but this way is much more fun :)…

The first step is to control (switch on/off) a mains-powered device with an Arduino, which operates at low DC voltage. For that I needed a Solid State Relay (other routes are possible here), and curiously enough that was exactly what I had in the mailbox today, from China. These are really cheap on eBay, just take the time to read the specs and match them to your needs. Remember, Volts x Amps = Watts, so Amps = Watts / Volts.

Ex:
a 100 W lamp at 220 V: 100 W / 220 V = 0.45 amps
a 3000 W heater at 220 V: 3000 W / 220 V = 13.6 amps

So for the heater the SSR should have an Output Current of at least 15 amps (or a bit higher to play safe), and the 220 V should be within its Output Voltage range. The Arduino output voltage is 5 V, so also check that the SSR Input Voltage range includes 5 V (if it doesn’t, you will not be able to control the SSR with the Arduino alone).
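The same arithmetic in a few lines of PHP, in case you want to check several devices against one relay. The function names are made up for this sketch, and the 15 amps rating is just the value from the heater example above.

```php
<?php
// Hypothetical helpers for sizing the SSR.

// Current drawn by a load: Amps = Watts / Volts.
function requiredAmps(float $watts, float $volts): float
{
    return $watts / $volts;
}

// True when the relay's rated output current covers the load.
function ssrIsEnough(float $ratedAmps, float $watts, float $volts): bool
{
    return $ratedAmps >= requiredAmps($watts, $volts);
}

printf("lamp:   %.2f amps\n", requiredAmps(100, 220));
printf("heater: %.2f amps\n", requiredAmps(3000, 220));
var_dump(ssrIsEnough(15, 3000, 220));
```

This only covers the output current; the voltage ranges still have to be checked against the specs by eye.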

I used an old appliance cable. Just strip the outer sheath and there should be 3 wires inside. The green/yellow wire is the ground, don’t touch this one. You should cut one of the other wires (normally blue or gray; if you can tell which one is live, usually brown in current European colour coding, that is the safer one to switch). Strip each cut end and connect the AC side of the SSR to them. Now connect the Arduino to the DC side of the SSR, one digital pin to positive and ground to negative. You can put the tools away now.

You can now connect the Arduino to the computer, install the drivers if needed and download the Arduino Environment. Just follow the instructions on the Arduino website and you should be up and running in minutes. The code is as simple as it gets: it reads from serial, and if it receives 1 the pin goes high (with voltage) and if it receives 0 the pin goes low (no voltage).

int pinNumber = 13;
int incomingByte;

void setup() {
    Serial.begin(9600);
    pinMode(pinNumber, OUTPUT);
}

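void loop() {
    // read one byte from serial, if there is one, and echo its ASCII code back
    if (Serial.available() > 0) {
        incomingByte = Serial.read();
        Serial.println(incomingByte);
    }

    if (incomingByte == '0') {          // ASCII 48: switch off
        digitalWrite(pinNumber, LOW);
    } else if (incomingByte == '1') {   // ASCII 49: switch on
        digitalWrite(pinNumber, HIGH);
    }
}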

I chose pin 13 (Arduino to SSR positive) because there is a built-in LED on it that helps with debugging. You can test now from the Arduino Environment: simply open the Serial Monitor (under Tools), send 1 and 0, and the light should switch on and off. You can also connect using PuTTY, a fine SSH client with a serial interface, or even the venerable Windows HyperTerminal. Just remember this is serial, so one client at a time :).

When you connect/disconnect from any client there is a rapid light flicker. At first I thought it was something related to the communication, zeros and ones being sent in the handshake or something. But I was wrong (as usual): this is a feature of the Arduino to simplify and automate the upload of new programs. On every serial connection it resets itself, hence the flicker, and waits a couple of seconds for a new program upload (a sketch, in Arduino lingo). If nothing is being uploaded, it then proceeds to run the normal program from the start (with state loss). You can disable the auto-reset feature if it doesn’t meet your needs.

Of course you want to control it programmatically, so my first try was with PHP. There is some code floating around the internets, something like:

$port = fopen('COM1', 'w'); // COM number where the Arduino is
fwrite($port, '1');
fclose($port);

and to switch the light off, something like:

$port = fopen('COM1', 'w'); // COM number where the Arduino is
fwrite($port, '0');
fclose($port);

BUT THIS CODE OBVIOUSLY WON’T WORK ON CURRENT STANDARD ARDUINOS, due to the auto-reset feature that I mentioned above. It will open the COM port (reset), then send the byte to the port too fast, while the device is still waiting for a sketch upload (which is the reason for the auto-reset in the first place), and then close the connection (reset again). So you will only end up with some fast light flicker….

This code will work:

if ($port = fopen('COM1', 'w')) { // open arduino port
    sleep(2);                     // wait for end of auto-reset 

    $i = 1;
    while (true) {                // loop forever to keep the port open
        if ($i % 2)               // if $i is odd, lights on
            fwrite($port, '1');
        else
            fwrite($port, '0');   // else lights off

        sleep(2);                 // wait 2 secs between each cycle
        $i++;
    }
    fclose($port);                // never reached, the loop only ends when the script is stopped
} else {
    print("Check port number or previous active session");
    die();
}

From here you could work it out and make a daemon that connects to the Arduino via serial, listens on some socket and routes/proxies the input from the socket to serial. It’s doable, and not that hard, but anyway PHP is not the right tool for the job, at all… so I used Serproxy, a proxy program for redirecting network socket connections to/from serial links. It is very easy to use and does just what I wanted.

From there you simply connect and send the on/off (1/0) command to the socket, and you don’t have to worry about auto-resets: when you close the socket connection the serial link stays up. So making a web interface from here was simple, and that is what I did, just for the fun of it (PHP works well here).
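A rough sketch of what the PHP side can look like once Serproxy is in the middle. This is not the code behind my web interface: the function names are made up, and the host and port in the usage comment are only placeholders, so use whatever your Serproxy configuration says.

```php
<?php
// Hypothetical helpers for talking to the Arduino through the proxy socket.

// The sketch on the Arduino understands '1' (on) and '0' (off).
function switchCommand(bool $on): string
{
    return $on ? '1' : '0';
}

// Writes the command to an already open stream (socket, file, ...).
// Returns true when the whole command was written.
function sendSwitch($stream, bool $on): bool
{
    $command = switchCommand($on);
    return fwrite($stream, $command) === strlen($command);
}

// Usage, assuming Serproxy listens on 127.0.0.1:5331 (placeholder values):
//
// $socket = fsockopen('127.0.0.1', 5331, $errno, $errstr, 5);
// if ($socket) {
//     sendSwitch($socket, true);   // lights on
//     fclose($socket);             // the serial link behind the proxy stays up
// } else {
//     print("Could not connect: $errstr ($errno)");
// }
```

Keeping the write in its own function makes it easy to try it against a plain file or a memory stream before pointing it at the real thing.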

Here it is working:

You can light my desk lamp now 🙂 on http://light.waynext.com/. Sorry, no upload bandwidth to put up a webcam stream (Porto Salvo = Juda’s ass), so you have to take my word for it, call Waynext or check out the video.

Your RGI (Reality Gateway Interface) is now complete. The next step is to put it to work remotely, with the Arduino disconnected from the computer, so I must dig into XBee shields. Also to dig into a direct interface with Perl, Python or Java.

PHP regexp replace word(s) in html string if not inside tags

The problem was to find and replace text inside HTML (without breaking the HTML). Take for example this string:

<img title="My image" alt="My image" src="/gfx/this is my image.gif"><p>This is my string</p>

and you want to replace the string “my” with another string, or enclose it inside another tag (let’s assume <strong></strong>), but only the “my” outside the HTML tags. So after the transformation it would look like:

<img title="My image" alt="My image" src="/gfx/this is my image.gif"><p>This is <strong>my</strong> string</p>

With the PHP regular expression functions, the typical solution, find and replace with word boundaries, fails here:

preg_replace('/\b(my)\b/i',
             '<strong>$1</strong>',
             $html_string);

you will end up with messed up HTML:

<img title="<strong>My</strong> image" alt="<strong>My</strong> image" src="/gfx/this is <strong>my</strong> image.gif"><p>This is <strong>my</strong> string</p>

Now think of the wonderful mess you would get when replacing words like “form” or “alt”, which can be a text node, an HTML tag or an attribute….

So how to fix this? I figured that the only thing common to all tags is the open and close characters, < and >. From there you simply search for the word you want to replace plus everything up to the next close tag character (the > sign), and within the matched result you look for an open tag character. If you don’t find one, you are inside a tag, so you abort the replace. Here is the code:

function checkOpenTag($matches) {
    if (strpos($matches[0], '<') === false) {
        return $matches[0];
    } else {
        return '<strong>'.$matches[1].'</strong>'.$matches[2];
    }
}

preg_replace_callback('/(\bmy\b)(.*?>)/i',
                      'checkOpenTag',
                      $html_string);

If you are going to use this kind of code to search for several words in an HTML text (e.g. a glossary implementation), test for performance and do think about a caching system.
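For the several words case, here is a sketch of a different technique from the callback above, not a drop-in replacement for it: split the HTML on the tags and run one replace with all the words on the text pieces only. The function name highlightWords is made up, and it assumes that no attribute value contains a > character.

```php
<?php
// Hypothetical helper: wraps every listed word in <strong>, but only in
// the text between tags.
function highlightWords(string $html, array $words): string
{
    if (count($words) === 0) {
        return $html;
    }

    // one pattern for all the words, each one escaped for the regexp
    $quoted  = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    // with PREG_SPLIT_DELIM_CAPTURE the result alternates text and tags:
    // even indexes are text, odd indexes are the tags themselves
    $pieces = preg_split('/(<[^>]*>)/', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($pieces as $i => $piece) {
        if ($i % 2 === 0 && $piece !== '') {
            $pieces[$i] = preg_replace($pattern, '<strong>$1</strong>', $piece);
        }
    }

    return implode('', $pieces);
}

echo highlightWords('<img title="My image"><p>This is my string</p>', array('my'));
```

With a long glossary the pattern gets big, so the advice about testing for performance and caching the result still applies.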

That’s it. Remember, this solution worked fine for me but it can also work terribly badly for you, so proceed at your own risk (aka liability disclaimer).

UPDATE 19-04-14
There was a comment on this post warning that only the first occurrence is replaced in each HTML segment. So here is an updated version of the PHP example with this issue corrected:

<?php

class replaceIfNotInsideTags {

  private $word; // word currently being replaced

  // callback: wrap the word only when the match crosses an opening tag,
  // i.e. when the word itself sits outside a tag
  private function checkOpenTag($matches) {
    if (strpos($matches[0], '<') === false) {
      return $matches[0];
    } else {
      // recurse into the rest of the match to catch further occurrences
      return '<strong>'.$matches[1].'</strong>'.$this->doReplace($matches[2]);
    }
  }

  private function doReplace($html) {
    return preg_replace_callback('/(\b'.preg_quote($this->word, '/').'\b)(.*?>)/i',
                                 array($this, 'checkOpenTag'),
                                 $html);
  }

  public function replace($html, $word) {
    $this->word = $word;

    return $this->doReplace($html);
  }
}

$html = '<p>my bird is my life is my dream</p>';

$obj = new replaceIfNotInsideTags();
echo $obj->replace($html, 'my');

?>

Lisbon Half Marathon

Finally a sub 2 hour half marathon, exactly 1h57m53s, with a lot of mixed feelings.

The first couple of kilometers were simply impossible to run properly, just a big gymkhana with all kinds of non-runners in the way, from baby strollers to old ladies walking hand in hand to avoid losing each other, and a lot more characters in between…. In the second third of the race I did a very good time, with many sub 5 minute kilometers, probably sub 50m 10Ks (couldn’t get all the splits), but I dropped the hammer a bit too soon, so the last 4Ks were hard. I stopped and walked a couple of times, maybe the mind giving in, since somewhere along the way the strong pace started to seem harder than what I was mentally prepared to cope with.

Except for the feet blisters (as usual….), everything was OK at the finish: legs, knees, muscles. Cool. By now, the sub 1h50 seems really doable, without any major change in training or lifestyle.

Anyway, probably not in this race. For sure it is a very scenic and fun course, and the weather is usually fine this time of the year (today a bit too warm though). But I have to rethink it next year: 45 minutes in a line to get the bib numbers; they ran out of time control chips ?? (so no official time for me and others); on race day, 45 minutes to walk/crawl 500 meters from the train station to the race start; the first kilometers you don’t run, you gymkhana; near Praça do Comercio, gymkhana again; missed 3 aid station water supplies due to all the confusion; and at the end another 30 minutes just to get through to the exit…..

Next running objectives:
sub 50m 10K
sub 5m 1500m (that is very hard)