Valhalla Legends Forums Archive | Web Development | Cleaning User input - php example

Author | Message | Time
LordVader
I've read a few threads that discuss data sanitization, and it is always a subject of debate.

But this is what I've found to be most useful, and it avoids having to depend on magic quotes or anything else.

Example code:
[code]
// This function is a copy of phpbb_rtrim().
function data_rtrim($str, $charlist = false)
{
    if ($charlist === false)
    {
        return rtrim($str);
    }

    $php_version = explode('.', PHP_VERSION);

    // php version < 4.1.0
    if ((int) $php_version[0] < 4 || ((int) $php_version[0] == 4 && (int) $php_version[1] < 1))
    {
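        // substr() replaces the $str{...} offset syntax, which PHP 8 removed;
        // the empty-string check stops the loop once everything has been trimmed
        while ($str !== '' && substr($str, -1) == $charlist)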
        {
            $str = substr($str, 0, strlen($str)-1);
        }
    }
    else
    {
        $str = rtrim($str, $charlist);
    }

    return $str;
}

// These functions strip out new lines, carriage returns
// and other bad data passed from the input fields
// Short input field 35 characters max
function cleanShort($data){
    // Monster header injection defence: strip raw and encoded CR/LF/tab sequences.
    // The replacements run in array order, the same as separate str_replace() calls;
    // '' is used because passing NULL as the replacement is deprecated since PHP 8.1.
    $data = str_replace(array("\n", "\r", "\t", "0x0D", "0x0A", "%0D", "%0A", "0x", "%0"), '', $data);
    // Send data to be cleaned up: unescape \', trim, escape HTML, then cap the length.
    // ENT_COMPAT (the default before PHP 8.1) leaves single quotes for the escaping below.
    $data = substr(htmlspecialchars(str_replace("\'", "'", trim($data)), ENT_COMPAT), 0, 35);
    // Drop trailing backslashes so the value cannot end in a dangling escape
    $data = data_rtrim($data, "\\");
    $data = str_replace("'", "\'", $data);
    return $data;
}
// Medium input field 55 characters max
function cleanMedium($data){
    // Monster header injection defence: strip raw and encoded CR/LF/tab sequences.
    // The replacements run in array order, the same as separate str_replace() calls;
    // '' is used because passing NULL as the replacement is deprecated since PHP 8.1.
    $data = str_replace(array("\n", "\r", "\t", "0x0D", "0x0A", "%0D", "%0A", "0x", "%0"), '', $data);
    // Send data to be cleaned up: unescape \', trim, escape HTML, then cap the length.
    // ENT_COMPAT (the default before PHP 8.1) leaves single quotes for the escaping below.
    $data = substr(htmlspecialchars(str_replace("\'", "'", trim($data)), ENT_COMPAT), 0, 55);
    // Drop trailing backslashes so the value cannot end in a dangling escape
    $data = data_rtrim($data, "\\");
    $data = str_replace("'", "\'", $data);

    return $data;
}
// Long input field 999 characters max
function cleanLong($data){
    // Monster header injection defence: strip raw and encoded CR/LF/tab sequences.
    // The replacements run in array order, the same as separate str_replace() calls;
    // '' is used because passing NULL as the replacement is deprecated since PHP 8.1.
    $data = str_replace(array("\n", "\r", "\t", "0x0D", "0x0A", "%0D", "%0A", "0x", "%0"), '', $data);
    // Send data to be cleaned up: unescape \', trim, escape HTML, then cap the length.
    // ENT_COMPAT (the default before PHP 8.1) leaves single quotes for the escaping below.
    $data = substr(htmlspecialchars(str_replace("\'", "'", trim($data)), ENT_COMPAT), 0, 999);
    // Drop trailing backslashes so the value cannot end in a dangling escape
    $data = data_rtrim($data, "\\");
    $data = str_replace("'", "\'", $data);

    return $data;
}
[/code]
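
To see what the header injection replacements do to actual input, here is a minimal sketch. `strip_header_chars()` is a hypothetical helper that is not part of the functions above; it only repeats the same nine replacements in the same order, and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: the nine replacements from the clean*() functions,
// applied in the same order, isolated so their effect can be inspected.
function strip_header_chars($data)
{
    $needles = array("\n", "\r", "\t", "0x0D", "0x0A", "%0D", "%0A", "0x", "%0");
    return str_replace($needles, '', $data);
}

// A raw CR/LF pair used to smuggle an extra mail header is removed,
// so the value stays on one line.
echo strip_header_chars("Hello\r\nBcc: victim@example.com"), "\n";
// prints: HelloBcc: victim@example.com

// The URL-encoded form of a line feed is removed as well.
echo strip_header_chars("%0ABcc: victim@example.com"), "\n";
// prints: Bcc: victim@example.com

// Side effect: harmless text that happens to contain "0x" is altered too.
echo strip_header_chars("colour 0xFF00FF"), "\n";
// prints: colour FF00FF
```

The last call is worth keeping in mind: the blanket "0x" and "%0" replacements also change ordinary text, which is probably acceptable for a login field but may not be for longer free-form input.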

As you can see, it's split up by expected data length (type?), so you can customize how each kind of field is parsed.
This allows small text blobs such as logins or emails to be parsed differently than large ones such as a submitted story, which may need to contain quotes, links and things of that nature. Each would need its own function to properly handle single/double quotes, etc.
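
One way to express the per-field differences, sketched here under some assumptions, is a single function that takes them as parameters. `clean_field()`, `$maxLength` and `$keepNewlines` are hypothetical names, not part of the code above, and this sketch deliberately differs from it in two places: it truncates before escaping, and it encodes quotes as HTML entities instead of backslash-escaping them.

```php
<?php
// Hypothetical generalisation: one function, with the per-field rules
// (maximum length, whether line breaks are allowed) passed in by the caller.
function clean_field($data, $maxLength, $keepNewlines = false)
{
    // Short fields (logins, emails) drop line breaks and tabs entirely;
    // long fields (stories) keep "\n" but still lose "\r" and "\t".
    $strip = $keepNewlines ? array("\r", "\t") : array("\n", "\r", "\t");
    $data = str_replace($strip, '', trim($data));

    // Truncate the raw text first so the cut never lands inside an HTML entity.
    $data = substr($data, 0, $maxLength);

    // ENT_QUOTES encodes both single and double quotes; ENT_SUBSTITUTE keeps
    // htmlspecialchars() from returning '' if the cut split a multibyte character.
    return htmlspecialchars($data, ENT_QUOTES | ENT_SUBSTITUTE);
}

// A short field: line breaks removed, markup escaped.
echo clean_field("  bob\r\n<admin>  ", 35), "\n";
// prints: bob&lt;admin&gt;

// A long field: the newline survives, the quotes are encoded.
echo clean_field("He said \"hi\"\nand left", 999, true), "\n";
```

Whether the length limit should apply to the raw text (as here) or to the escaped text (as in the functions above) depends on where the value ends up, for example on the size of the database column that stores it.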
September 16, 2006, 9:53 AM
