Jonathan Hilgeman

Everything complex is made up of simpler things.

Use filter_var to Strip Out Non-Digit Characters

Mar-28-2018

Here’s a fun tidbit – how many times have you used a simple regular expression like this:

$x = preg_replace("/[^0-9]/","", $y);

…to quickly strip out all non-digit characters in a string? I’ve done it probably a hundred times over the years, but I ran across a script today that was packed FULL of code that used regular expressions where they didn’t need to be used, and it was taking about 100 milliseconds to run on each loop (and there were thousands of loops).

I know the character-by-character parsing in C# is extremely fast, so I first tried to do the same thing with a user function in PHP like this:

function stripNonDigits($str)
{
  $new = "";
  $len = strlen($str);
  for($i = 0; $i < $len; $i++)
  {
    $c = $str[$i];
    if(($c >= '0') && ($c <= '9'))
    {
      $new .= $c;
    }
  }
  return $new;
}

I assumed that wasn’t going to work all that well since it wasn’t compiled code, and I was right. The overhead alone from calling a user function almost surpassed the regex performance times, and once I added in the character comparison and string concatenation, it was all over. It was way slower than regex (over 2x as slow)!

I then figured that PHP had to have a built-in way to do this, and I remembered filter_var had a filter for integer numbers. That filter keeps + and - signs along with the digits, so I set up another test where the filtering line was just:

$signs = array("-","+");
$x = str_replace($signs,"",filter_var($y, FILTER_SANITIZE_NUMBER_INT));
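
To make the two steps easier to see, here is a minimal sketch that wraps them in a function. The helper name stripNonDigitsFiltered and the sample phone number are made up for illustration; only filter_var, FILTER_SANITIZE_NUMBER_INT, and str_replace come from the line above.

```php
<?php
// Hypothetical helper name; it wraps the two-step approach from the post.
function stripNonDigitsFiltered(string $value): string
{
    // FILTER_SANITIZE_NUMBER_INT removes everything except digits, "+" and "-".
    $filtered = filter_var($value, FILTER_SANITIZE_NUMBER_INT);

    // str_replace then removes the sign characters the filter left behind.
    return str_replace(array("-", "+"), "", $filtered);
}

$phone = "+1 (555) 123-4567"; // made-up sample input

echo filter_var($phone, FILTER_SANITIZE_NUMBER_INT), "\n"; // +1555123-4567
echo stripNonDigitsFiltered($phone), "\n";                 // 15551234567
```

The first echo shows why the str_replace step is needed: the filter alone leaves the signs in place.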

The result was blisteringly fast compared to preg_replace. I ran the same code against a variety of data sources – from 64-byte strings to 256 KB strings, and filter_var() consistently outperformed all other methods. The only way I can think of to get better performance is to build a custom PHP extension that would strip out the plus and minus signs as well, but this is about as good as we can get it using standard PHP functionality.
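
If you want to try the comparison yourself, something along these lines should work as a rough timing harness. It is a sketch, not the script I used: the timeIt helper, the test string, and the iteration count are all assumptions, and the numbers will vary by PHP version and machine.

```php
<?php
// Hypothetical timing helper: runs $fn $iterations times and returns elapsed seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Assumed test data: roughly 64 KB of mixed digits, letters, signs, and punctuation.
$y = str_repeat("ab-12+cd 34.", 5461);
$signs = array("-", "+");

// The regex approach from the top of the post.
$regex = function () use ($y) {
    return preg_replace("/[^0-9]/", "", $y);
};

// The filter_var + str_replace approach.
$filter = function () use ($y, $signs) {
    return str_replace($signs, "", filter_var($y, FILTER_SANITIZE_NUMBER_INT));
};

// Check that both approaches agree before comparing their timings.
var_dump($regex() === $filter());

printf("preg_replace: %.4f s\n", timeIt($regex, 1000));
printf("filter_var:   %.4f s\n", timeIt($filter, 1000));
```

Checking that both closures return the same string first matters, since a faster method that produces different output is not a fair comparison.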

So next time you reach for regex, check out filter_var() first!

UltraEdit

May-30-2016

URL: http://www.ultraedit.com

Quick Summary


One of the best investments I ever made was to pony up a few extra bucks over a decade ago to buy the lifetime license for UltraEdit. It’s lightning fast, has more features than a dozen Swiss army knives, has fantastic, responsive and personal support (no outsourced tech support that claims their name is “Ken” and only responds with canned messages), and is just an all-around fantastic editor for anything text-based (code, XML, etc).


Security Task Manager

May-30-2016

URL: http://www.neuber.com/taskmanager/

Quick Summary


When I’m checking out a system for malware, one of my first stops is to install Neuber’s Security Task Manager. I came across this little gem several years ago, when a client asked me to investigate the “Case of the Missing Space”. Basically, their drives were constantly losing free space, and none of the regular tools like TreeSize were able to determine where the massive amounts of used bytes were, and the regular task manager wasn’t showing any weird activity, but the server was acting very strange.


Comodo Internet Security Review

May-30-2016

Quick Summary


For the past year, I’ve been tight with Comodo Internet Security Pro. I have it on just about every box I own, including my wife’s computer and my parents’ computers. It’s done a fairly good job so far and has some well-rounded features (stateful antivirus for better performance, behavior-based protection, auto-sandboxing of new and untrusted apps, a comprehensive default list of trusted software publishers, a firewall, etc…). Like pretty much all security software, it does the whole “Do you want to allow X to do Y?” messages that are sometimes cryptic, but the “Trusted Vendors” list keeps those messages to a minimum, which makes it a good option for keeping my less-technical parents safe.
