html posts

Quick regex to strip html tags

Recently, I needed to strip HTML tags from some data. The goal was to turn a database field populated from a WYSIWYG text area into plain text that could go inside a link. I used a simple regex, /<\/?[^>]+>/, to find the tags and replace them with an empty string. In PHP, this looked like:

$string = preg_replace('/<\/?[^>]+>/', '', $string);

This is perhaps a naïve implementation, but it served my purposes fine. Of course, I had totally forgotten about PHP’s built-in strip_tags() function, but when I compared the two, it did not seem to do exactly what I wanted either. For instance, it seems to get rid of the content of <a> tags.
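To make the "naïve" caveat concrete, here is a minimal sketch of the same pattern run against two sample strings. The inputs and variable names are made up for illustration and are not the original data; the second input shows one case where the pattern breaks down.

```php
<?php
// Hypothetical sample input, not taken from the original database field.
$html = '<p>Read <a href="/about">about <em>me</em></a> &amp; more</p>';

// Same pattern as above: "<", an optional slash, then everything up to the next ">".
$plain = preg_replace('/<\/?[^>]+>/', '', $html);
// $plain is now 'Read about me &amp; more'

// The regex leaves entities alone, so decode them separately if real plain text is needed.
$decoded = html_entity_decode($plain, ENT_QUOTES | ENT_HTML5, 'UTF-8');
// $decoded is now 'Read about me & more'

// Weak spot: a ">" inside an attribute value ends the match early.
$tricky = '<a title="a > b">link</a>';
$leftover = preg_replace('/<\/?[^>]+>/', '', $tricky);
// $leftover is now ' b">link' (part of the attribute leaks into the text)
```

For content that comes out of a WYSIWYG editor, the second case is probably rare, which may be why the simple pattern was good enough here.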


Web app manifest, first go

I’ve added a basic web app manifest to my site. I have not experimented with the results, but I did run it through a web manifest validator, and it mostly passed. I used the MDN guide and the HTML5 Doctor article for help. I also read some of the in-progress spec, though it seemed aimed more at implementers. The content of my manifest is currently (prettified):

{
    "background_color": "#4e784e"
    ,"display": "browser"
    ,"icons": [
        {
            "sizes": "64x64"
            ,"src": "favicon.gif"
            ,"type": "image\/gif"
        }
    ]
    ,"lang": "en-US"
    ,"name": "Toby Mackenzie\u0027s site"
    ,"scope": "\/"
    ,"short_name": "\u003Ctoby\u003E"
    ,"start_url": "\/"
    ,"theme_color": "#4e784e"
}

I’m just using Symfony’s JsonResponse object to render a PHP array.
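The escapes in the manifest shown above (\u0027, \u003C, \/) are what PHP's JSON encoder emits when the HEX flags are on. The sketch below reproduces that with plain json_encode() so it runs without Symfony. The array is a reconstruction from the output, not the site's actual code, and the flag set is an assumption about what JsonResponse passes to json_encode() by default.

```php
<?php
// Hypothetical reconstruction of the array behind the manifest above.
$manifest = [
    'background_color' => '#4e784e',
    'display' => 'browser',
    'icons' => [
        ['sizes' => '64x64', 'src' => 'favicon.gif', 'type' => 'image/gif'],
    ],
    'lang' => 'en-US',
    'name' => "Toby Mackenzie's site",
    'scope' => '/',
    'short_name' => '<toby>',
    'start_url' => '/',
    'theme_color' => '#4e784e',
];

// Assumed to match JsonResponse's default encoding options:
// escape <, >, ', & and " as \uXXXX sequences.
$flags = JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_AMP | JSON_HEX_QUOT;

// Forward slashes are escaped as \/ by default (no JSON_UNESCAPED_SLASHES here).
$json = json_encode($manifest, $flags);

// Decoding gives back the original values, so the escapes are cosmetic.
$roundTrip = json_decode($json, true);
```

In a Symfony controller, the equivalent would presumably be returning new JsonResponse($manifest), which handles the encoding and the response headers in one step.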

This is one more thing that I really shouldn’t’ve put time into until my site is more fleshed out, but it seemed cool and simple to add.


TMCom Goes HTML 5 (Doctype anyway)

I’ve finally joined the bandwagon and moved my site (not this blog) to the HTML 5 doctype. It is much simpler than previous doctypes:

<!DOCTYPE html>

breaking away from the SGML standard of including a reference to the DTD for that doctype. I’m not sure how this will play out as HTML moves beyond 5, but I’m sure it won’t be a problem for a while anyway. And hopefully with all this time they are taking to finalize the specifications, we won’t have problems with backward compatibility, future expansion, validation of old documents, etc.

Anyway, I had considered using the doctype a while back but abandoned it for reasons I don’t remember. The “role” attribute, which I first noticed in WordPress themes, is what got me to consider HTML 5 again. It offers potential accessibility benefits to user agents that know about it by specifying what an element’s “role” is in relation to its document: navigation, banner, main, contentinfo, etc. HTML 5 offers elements with similar meanings, but they are not well supported. The attribute is not valid in XHTML 1 (it was proposed for XHTML 2, which never came about), and my attempts at an alternative doctype failed.
