html posts

Quick regex to strip html tags

Recently, I needed to strip HTML tags from some data. The goal was to turn a database field populated from a WYSIWYG text area into plain text that could go inside a link. I used a simple regex, /<\/?[^>]+>/, to find the tags so I could replace them with an empty string. In PHP, this looked like:

$string = preg_replace('/<\/?[^>]+>/', '', $string);

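This is perhaps a naïve implementation (it would trip on a > inside an attribute value, for example), but it served my purposes fine. Of course, I had totally forgotten about PHP’s built-in strip_tags() function, but on comparing the two, it also seems not to do exactly what I want. For instance, it seemed to get rid of the content of <a> tags in my data. strip_tags() is documented to remove only the tags themselves, but it does not validate the HTML, so broken or partial tags can make it remove more text than expected.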
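A quick way to compare the two approaches is to run both over the same string. This is only a minimal sketch: the $html sample is made up for illustration (it is not the original database content), and it only covers well-formed markup.

```php
<?php
// Hypothetical sample standing in for WYSIWYG field content.
$html = '<p>Read <a href="/about">about <em>me</em></a> &amp; more</p>';

// The regex from the post: match an opening or closing tag and drop it.
$viaRegex = preg_replace('/<\/?[^>]+>/', '', $html);

// PHP's built-in equivalent.
$viaStripTags = strip_tags($html);

echo $viaRegex, PHP_EOL;
echo $viaStripTags, PHP_EOL;

// On this well-formed input both keep the link text and leave entities alone.
var_dump($viaRegex === $viaStripTags);
```

Neither approach decodes entities such as &amp;, so a separate step would be needed if the plain text has to contain the literal characters.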


Web app manifest, first go

I’ve added a basic web app manifest to my site. I have not experimented with the results yet, but I did run it through a web manifest validator, and it mostly passed. I used the MDN guide and the HTML5 Doctor article for help. I also read some of the in-progress spec, though it seemed aimed more at implementers. The content of my manifest is currently (prettified):

{
    "background_color": "#4e784e"
    ,"display": "browser"
    ,"icons": [
        {
            "sizes": "64x64"
            ,"src": "favicon.gif"
            ,"type": "image\/gif"
        }
    ]
    ,"lang": "en-US"
    ,"name": "Toby Mackenzie\u0027s site"
    ,"scope": "\/"
    ,"short_name": "\u003Ctoby\u003E"
    ,"start_url": "\/"
    ,"theme_color": "#4e784e"
}

I’m just using Symfony’s JsonResponse object to render a PHP array.
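The escaped characters in the output above (\u0027, \u003C, \/) come from how the array is encoded. As a rough stand-in that runs without Symfony installed, the sketch below uses plain json_encode() with the same flags that JsonResponse uses by default (JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_AMP | JSON_HEX_QUOT). The $manifest array is a hypothetical subset of the real one, and this is not the actual controller code.

```php
<?php
// Hypothetical subset of the manifest as a PHP array.
$manifest = [
    'name' => "Toby Mackenzie's site",
    'short_name' => '<toby>',
    'start_url' => '/',
];

// Same flags as Symfony's JsonResponse default encoding options:
// <, >, ', & and " are written as \uXXXX escapes.
$flags = JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_AMP | JSON_HEX_QUOT;

$json = json_encode($manifest, $flags);
echo $json, PHP_EOL;

// Decoding gives back the original strings, so the escapes are harmless.
$decoded = json_decode($json, true);
var_dump($decoded === $manifest);
```

The escapes make the JSON safe to embed in HTML; any JSON parser reading the manifest sees the plain apostrophe, angle brackets, and slash.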

This is one more thing that I really shouldn’t’ve put time into until my site is more fleshed out, but it seemed cool and simple to add.


TMCom Goes HTML 5 (Doctype anyway)

I’ve finally joined the bandwagon and moved my site (not this blog) to the HTML 5 doctype. It is much simpler than previous doctypes:

<!DOCTYPE html>

breaking away from the SGML standard of including a reference to the DTD for that doctype. I’m not sure how this will play out as HTML moves beyond 5, but I’m sure it won’t be a problem for a while anyway. And hopefully with all this time they are taking to finalize the specifications, we won’t have problems with backward compatibility, future expansion, validation of old documents, etc.

Anyway, I had considered using the doctype a while back but abandoned it for reasons I don’t remember. The “role” attribute, which I first noticed in WordPress themes, is what got me to consider HTML 5 again. It offers potential accessibility benefits to user agents that know about it by specifying what an element’s “role” is in relation to its document: navigation, banner, main, contentinfo, etc. HTML 5 offers elements with similar meanings, but they are not well supported yet. The attribute is not valid in XHTML 1 (it was proposed for XHTML 2, which never came about), and my attempts at an alternative doctype failed.
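To make the combination concrete, here is a minimal sketch of a PHP template string that pairs the short doctype with the four roles named above. The markup is hypothetical (the div elements, text, and title are placeholders, not the actual site's structure).

```php
<?php
// Hypothetical page skeleton; element choices and text are illustrative only.
$page = <<<HTML
<!DOCTYPE html>
<html lang="en-US">
<head>
    <meta charset="utf-8">
    <title>Example page</title>
</head>
<body>
    <div role="banner">Site name</div>
    <div role="navigation"><a href="/">Home</a></div>
    <div role="main">Page content</div>
    <div role="contentinfo">Footer notes</div>
</body>
</html>
HTML;

echo $page, PHP_EOL;
```

Plain div elements are used here so the roles carry all of the meaning, which fits the case where the newer HTML 5 elements cannot be relied on.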
