timgws / CleanHTML by timgws

Quickly and easily clean up HTML text, making sure that only the bare minimum is left behind
Package Data
Maintainer Username: timgws
Maintainer Contact: tim@timg.ws (Tim Groeneveld)
Package Create Date: 2013-06-13
Package Last Update: 2016-03-11
Home Page:
Language: PHP
License: MIT
Last Refreshed: 2024-12-15 15:15:52
Package Statistics
Total Downloads: 115
Monthly Downloads: 0
Daily Downloads: 0
Total Stars: 6
Total Watchers: 3
Total Forks: 2
Total Open Issues: 0

CleanHTML


Making HTML clean since late 2012!

Requirements

  • PHP 5.2+
  • php-xml

How to install

    composer require timgws/cleanhtml

How to use


    require 'vendor/autoload.php'; // Composer autoloader

    use timgws\CleanHTML\CleanHTML;

    $tidy = new CleanHTML();
    $output = $tidy->clean('<p><strong>I need a shower. I am dirty HTML.</strong>');

$output should now contain:

    <h2>I need a shower. I am dirty HTML.</h2>

Calling the clean function removes tables, any JavaScript, and other unfriendly items that you might not want to see in user-submitted HTML.

If you want to see some examples, the best place to look is the CleanHTML tests.

What does it do?

  1. Removes additional spaces from HTML
  2. Replaces multiple <br /> tags with paragraph tags
  3. Removes any <script> tags
  4. Renames any <h1> tags to <h2>
  5. Changes <p><strong> tags to <h2>
  6. Replaces <h2><strong> with just <h2> tags
  7. Removes weird <p><span> tags
  8. Uses HTML Purifier to allow only the h1, h2, h3, h4, h5, p, strong, b, ul, ol, li, hr, pre and code tags
  9. Runs steps 3 to 7 one more time, just to catch anything that might have been missed by the allowed-tags filter
  10. Outputs nice clean HTML \o/
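
To make the list above more concrete, here is a minimal sketch of steps 1 to 6 using plain regular expressions. It is not the CleanHTML source: the function name sketchClean() is hypothetical, the patterns are assumptions about what each step means, and it expects reasonably well-formed input. The HTML Purifier pass (step 8) and the <p><span> cleanup (step 7) are left out.

```php
<?php
// Illustrative sketch only, NOT the library's implementation.
// sketchClean() is a hypothetical name used for this example.
function sketchClean($html)
{
    // Step 1: collapse runs of whitespace into a single space.
    // (Assumption: this naive version also touches <pre> content.)
    $html = preg_replace('/\s+/', ' ', $html);

    // Step 2: turn two or more consecutive <br> tags into a paragraph break.
    $html = preg_replace('#(?:<br\s*/?>\s*){2,}#i', '</p><p>', $html);

    // Step 3: drop <script> elements together with their contents.
    $html = preg_replace('#<script\b[^>]*>.*?</script>#is', '', $html);

    // Step 4: rename <h1> to <h2> (both opening and closing tags).
    $html = preg_replace('#<(/?)h1\b#i', '<$1h2', $html);

    // Step 5: <p><strong>text</strong></p> becomes <h2>text</h2>.
    $html = preg_replace('#<p>\s*<strong>([^<]+)</strong>\s*</p>#i', '<h2>$1</h2>', $html);

    // Step 6: <h2><strong>text</strong></h2> becomes <h2>text</h2>.
    $html = preg_replace('#<h2>\s*<strong>([^<]+)</strong>\s*</h2>#i', '<h2>$1</h2>', $html);

    return trim($html);
}

$dirty = '<h1>Title</h1><p><strong>Heading</strong></p>'
       . '<p>one<br /><br />two</p><script>alert(1)</script>';

echo sketchClean($dirty), "\n";
// prints: <h2>Title</h2><h2>Heading</h2><p>one</p><p>two</p>
```

Regular expressions are used here only to keep the sketch short and dependency-free. Unlike the clean() example earlier, this sketch will not repair an unclosed <p> tag, so real user input still needs a proper parser and a whitelist filter such as HTML Purifier.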