Category Archives: HTML-CSS

Flickr’s new HTML code embedding – how to remove the header and footer

Flickr has altered its default embed HTML to include a header and footer, which include Flickr branding and the title of the picture.

PT in the Sense8 titles 01

Sometimes I suppose this is okay, but sometimes I just want the picture.

Fortunately it seems to be relatively easy to get rid of. In the example above:

<a data-flickr-embed="true" data-header="true" data-footer="true" href="https://www.flickr.com/photos/danielbowen/19038778583/in/dateposted/" title="PT in the Sense8 titles 01"><img src="https://farm1.staticflickr.com/313/19038778583_3149e7e01a.jpg" width="500" height="282" alt="PT in the Sense8 titles 01"></a><script async src="//embedr.flickr.com/assets/client-code.js" charset="utf-8"></script>

…remove the data-flickr-embed, data-header, and data-footer attributes from the a tag, and remove the script element, like this:

<a href="https://www.flickr.com/photos/danielbowen/19038778583/in/dateposted/" title="PT in the Sense8 titles 01"><img src="https://farm1.staticflickr.com/313/19038778583_3149e7e01a.jpg" width="500" height="282" alt="PT in the Sense8 titles 01"></a>

The result should be just the photo, with the usual linking back to Flickr.

PT in the Sense8 titles 01
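If you have a lot of embeds to fix, the same edit could be scripted. This is only a rough sketch in Python: the function name strip_flickr_embed is made up for illustration, and it assumes the snippet looks like the one above (double-quoted attribute values, one script element).

```python
import re

# Attributes Flickr adds to the <a> tag in the example embed code.
FLICKR_ATTRS = ("data-flickr-embed", "data-header", "data-footer")


def strip_flickr_embed(snippet: str) -> str:
    """Return the snippet as a plain linked image (hypothetical helper)."""
    # Drop the <script ...></script> element that loads client-code.js.
    snippet = re.sub(
        r"<script\b[^>]*>.*?</script>",
        "",
        snippet,
        flags=re.IGNORECASE | re.DOTALL,
    )
    # Drop each Flickr attribute, its quoted value and the space before it.
    for attr in FLICKR_ATTRS:
        snippet = re.sub(r'\s+%s="[^"]*"' % re.escape(attr), "", snippet)
    return snippet.strip()
```

Regular expressions are fine for a fixed snippet like this one, but they are not a general HTML parser, so check the output before pasting it anywhere.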

It’d be nice if they made this a built-in option when generating the HTML code.

Of course, it also makes me ponder if I should be finding another photo host.

Update 2015-07-20: They seem to have modified their default embedding code a bit so the branding and picture details now only appear over the photo when you mouse over it. Not so objectionable.

PT in the Sense8 titles 01

Flickr’s modified code now excludes data-header="true" data-footer="true" which presumably added the header and footer.

Is Django MVC doing it wrong?

I’ve just started fooling around with Django (a Python web framework), and was looking to produce a form. Bear in mind that Django doesn’t really do MVC, but follows the philosophy – separation of logic, representation and appearance:

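from django import forms
from django.shortcuts import render

class BookForm(forms.Form):
    title = forms.CharField()

def BookView(request):
    # An unbound form: no data yet, just the empty field to render.
    form = BookForm()
    # render() supersedes render_to_response(), which was removed in Django 3.0.
    return render(request, 'book.html', {'form': form})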

With book.html containing (amongst other things):

<form action="" method="get">
{{ form.as_table }}
<input type="submit" value="Search" />
</form>

Which is great! MVC: separation of data, presentation and business logic. Now, how do you get a CSS class onto that title field (CSS being the way of separating the presentation of an HTML page from the data embedded in it)? As above, but chuck it in like this:

class BookForm(forms.Form):
    title = forms.CharField(
        widget=forms.TextInput(attrs={'class':'title-field'}))

Seeing this crunched the gearbox in my mind. All that messy designer stuff, where they make things look nice, that’s worming its way into my business logic? Perhaps it’s not so wrong, as the business logic does indeed know that this is a title-field. But it doesn’t quite sit right with me. I’m not convinced it’s wrong, but if you were, you could instead do this in your CSS and HTML:

<style>
.title-field input {background:#ccC68f;}
</style>
<form action="" method="get">
<table>
<tr><td class="title-field"> {{ form.title }} </td></tr>
</table>
<input type="submit" value="Search" />
</form>

Which pretty much forces you to individually place fields — you get to specify the order of fields plus their individual CSS classes.

I’m not sure what the answer is here. Anyone care to enlighten this noob? Bear in mind that Django has a thing (ModelForm) to magically tie a model to a form, meaning you don’t need to specify the fields in both the form and the model, which you can’t use if you start tossing styles into each field.

HTML5test.com

Less crazy than the Acid Tests is www.html5test.com

Here’s what I get from a few random browsers I have lying around the place:

Firefox 3.5.9 scores 100 out of 160.

Chrome 4.1 scores 118 out of 160.

IE6? 11 out of 160.

IE8? Surprisingly, only 19 out of 160.

The browser on my Nokia N95 phone doesn’t load the page properly; it just says “Working…” and 0 out of 4 (i.e. it stalls on the first round of tests).

Interestingly, I also tried IE6 with the Google Chrome Frame in it; it scored 137 out of 160, better than Chrome itself. Weird.
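For a quick side-by-side, the scores above can be turned into percentages and ranked. This is a throwaway sketch: the dictionary just restates the numbers quoted in this post, and the as_percent helper is a name invented for illustration.

```python
# Scores reported in this post, each out of a possible 160.
MAX_SCORE = 160
scores = {
    "Firefox 3.5.9": 100,
    "Chrome 4.1": 118,
    "IE6": 11,
    "IE8": 19,
    "IE6 + Chrome Frame": 137,
}


def as_percent(score: int, total: int = MAX_SCORE) -> float:
    """Convert a raw score to a percentage, rounded to one decimal place."""
    return round(100 * score / total, 1)


# Highest score first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for browser, score in ranked:
    print(f"{browser}: {score}/{MAX_SCORE} = {as_percent(score)}%")
```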

Obviously all the browser authors have a way to go to support this if it’s going to be the bold new standard on the web.

(Found via Andrew)

Tables: MS Word vs CSS

Here’s why I like CSS.

Here’s a table created in Microsoft Word and pasted into a CMS:


<table border="1" cellspacing="0" cellpadding="0" class="MsoNormalTable" style="border-collapse: collapse; border: medium none"><tbody><tr><td width="64" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: windowtext 2.25pt solid; padding-left: 5.4pt; background: #4bacc6 0% 50%; padding-bottom: 0cm; border-left: #f0f0f0; width: 47.65pt; padding-top: 0cm; border-bottom: windowtext 2.25pt solid; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">&nbsp;</font></span></strong></td><td width="170" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: windowtext 2.25pt solid; padding-left: 5.4pt; background: #4bacc6 0% 50%; padding-bottom: 0cm; border-left: #f0f0f0; width: 127.85pt; padding-top: 0cm; border-bottom: windowtext 2.25pt solid; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Description</font></span></strong><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">&nbsp;</font></span></strong></td><td width="335" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: windowtext 2.25pt solid; padding-left: 5.4pt; background: #4bacc6 0% 50%; padding-bottom: 0cm; border-left: #f0f0f0; width: 250.95pt; padding-top: 0cm; border-bottom: windowtext 2.25pt solid; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">&nbsp;</font></span></strong></td></tr><tr style="height: 36.85pt; page-break-inside: avoid"><td rowspan="7" width="64" valign="top" 
style="padding-right: 5.4pt; padding-left: 5.4pt; background: #4bacc6 0% 50%; padding-bottom: 0cm; width: 47.65pt; padding-top: 0cm; height: 36.85pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Benefits</font></span></strong></td><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; height: 36.85pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Low Total Cost of Ownership (TCO)</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; height: 36.85pt; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">No up-front hardware or software costs</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Significantly less work effort to set-up a B2B integration solution since it involves&nbsp;mostly configuration tasks rather than programming</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Free use of online development interface&nbsp;for developers&nbsp;</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: 
&#39;Calibri&#39;,&#39;sans-serif&#39;">Data processing rates for usage&nbsp;are world&rsquo;s best</span></p></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Best Return on Investment (ROI)</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">ROI achieved sooner due to low up-front and on-going costs </span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Optimizes work effort since tasks removed or simplified</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Higher ROI due to removal of costs</span></p></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Speed of Delivery</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span 
style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Solutions delivered in days and weeks rather than months and years</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">No requirement to establish and maintain hardware and software</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Support for standards reduces need for specialists and training</span></p></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Control and Flexibility</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Developers have full control over tenancies, design data and administration</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Development can be done anywhere at anytime</span></p></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Guaranteed Service </font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: 
#d8d8d8 0% 50%; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Secure and reliable infrastructure </span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Guaranteed service level</span></p><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Data-back-up and disaster recovery provided </span></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Market Leading Service</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; background-color: transparent; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Most advanced functionality</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">First remotely configurable Integration, BPM and BI service</span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Wide support for industry standards and customizations</span></p></td></tr><tr><td width="170" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 127.85pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: 
-moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Future Proof</font></span></td><td width="335" valign="top" style="padding-right: 5.4pt; padding-left: 5.4pt; background: #d8d8d8 0% 50%; padding-bottom: 0cm; width: 250.95pt; padding-top: 0cm; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial; border: #f0f0f0"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Quarterly releases ensure up-to-date functionality </span></p><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Upgrades are our responsibility</span></p></td></tr><tr><td width="64" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: #f0f0f0; padding-left: 5.4pt; background: #4bacc6 0% 50%; padding-bottom: 0cm; border-left: #f0f0f0; width: 47.65pt; padding-top: 0cm; border-bottom: windowtext 2.25pt solid; -moz-background-clip: -moz-initial; -moz-background-origin: -moz-initial; -moz-background-inline-policy: -moz-initial"><strong><span style="color: white; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">&nbsp;</font></span></strong></td><td width="170" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: #f0f0f0; padding-left: 5.4pt; padding-bottom: 0cm; border-left: #f0f0f0; width: 127.85pt; padding-top: 0cm; border-bottom: windowtext 2.25pt solid; background-color: transparent"><span style="font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;"><font size="2">Expert Assistance</font></span></td><td width="335" valign="top" style="border-right: #f0f0f0; padding-right: 5.4pt; border-top: #f0f0f0; padding-left: 5.4pt; padding-bottom: 0cm; border-left: #f0f0f0; width: 250.95pt; padding-top: 0cm; border-bottom: 
windowtext 2.25pt solid; background-color: transparent"><p style="margin: 0cm 0cm 4pt" class="MsoBodyText"><span style="font-size: 8pt; font-family: &#39;Calibri&#39;,&#39;sans-serif&#39;">Expertise and knowledge available for support, development, consulting and training</span></p></td></tr></tbody></table>

With a little CSS coding (held in an external file), it has become this:


<table class="featuretable">
  <tbody>
    <tr>
      <td class="ftop"></td>
      <td class="ftop">Description</td>
      <td class="ftop"></td>
    </tr>
    <tr>
      <td class="fside">Benefits</td>
      <td class="fd0">Low Total Cost of Ownership (TCO)</td>
      <td class="fd0">No up-front hardware or software
costs<br>
Significantly less work effort to set-up a B2B integration solution
since it involves mostly configuration tasks rather than programming<br>
Free use of online development interface for developers <br>
Data processing rates for usage are world’s best</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd1">Best Return on Investment (ROI)</td>
      <td class="fd1">ROI achieved sooner due to low
up-front and on-going costs<br>
Optimizes work effort since tasks removed or simplified<br>
Higher ROI due to removal of costs</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd0">Speed of Delivery</td>
      <td class="fd0">Solutions delivered in days and
weeks rather than months and years<br>
No requirement to establish and maintain hardware and software<br>
Support for standards reduces need for specialists and training</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd1">Control and Flexibility</td>
      <td class="fd1">Developers have full control over
tenancies, design data and administration<br>
Development can be done anywhere at anytime</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd0">Guaranteed Service</td>
      <td class="fd0">Secure and reliable infrastructure<br>
Guaranteed service level<br>
Data-back-up and disaster recovery provided</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd1">Market Leading Service</td>
      <td class="fd1">Most advanced functionality<br>
First remotely configurable Integration, BPM and BI service<br>
Wide support for industry standards and customizations</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd0">Future Proof</td>
      <td class="fd0">Quarterly releases ensure up-to-date
functionality<br>
Upgrades are our responsibility</td>
    </tr>
    <tr>
      <td class="fside"></td>
      <td class="fd1">Expert Assistance</td>
      <td class="fd1">Expertise and knowledge available
for support, development, consulting and training</td>
    </tr>
  </tbody>
</table>

Old version: 12250 bytes.

New version: 2490 bytes + 605 bytes of CSS. It’s also much more maintainable, and it’ll be easier to change the table styles later.
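To put a number on the saving, here is the arithmetic as a few lines of Python. The byte counts are the ones quoted above; the variable names are just for illustration.

```python
old_html = 12250  # bytes: table as pasted from Word
new_html = 2490   # bytes: hand-written table markup
css = 605         # bytes: rules held in the external stylesheet

new_total = new_html + css      # markup plus stylesheet
saving = old_html - new_total   # bytes no longer sent

print(f"New total: {new_total} bytes")
print(f"Saved: {saving} bytes ({saving / old_html:.0%} smaller)")
```

That counts the CSS in full against this one table. Since the stylesheet is an external file, any other table using the same classes would not need those rules repeated in its markup.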

OK, the new one looks slightly different to the old (this was on purpose, to enlarge the fonts a bit), but jeez.

CSS for table displays

After much time swearing over how to get a table-like display out of CSS, I was stumped. All I wanted was a definition with the label on the left-hand side and the text on the right, wrapped into a column.

And let me tell you: given my limited knowledge of it, wrestling with CSS is not my idea of fun.

Finally, after scouring Google for various terms, I did a search for “hanging indents”, which led me to a good way to do it using dl, dt and dd tags, with appropriate CSS for each. Eureka! (Yes, if I’d thought about it, these tags are for “definitions”, precisely what I was trying to do.)

Thank you, the good people at Max Design.

Toys “R” Stupid

Want to see some HTML Form stupidity? Go to http://www.toysrus.com.au/site/signUp.htm and you get:

The stupidest HTML form I've seen in a while

Radio buttons – users know what to expect from them. You can pick only one option. Not these puppies. These happen to be round checkboxes – that you can only turn on. You can’t turn them off! Oh, sure, there’s a “reset” button down the bottom of the form, but can you recall the last time you pressed the “reset” button on a form? I don’t think, in my many many years using the ‘net, I ever have. Not once. I have “reset button blindness”, and I imagine a bunch of others do too.

To top this off, because the site is mainly Flash, figuring out the address of the page took a while. In the end I had to bookmark it to find it.

I guess that’s what happens when you get schoolchildren to build your website.

Where are the aliens?

Coffee drinkers are easier to persuade.

Fermi’s Paradox is explained by aliens getting addicted to computer gaming.

Strom reckons he knows how to make money with a website: ads! Plus a little other stuff.

An Irishman has a rather good summary of how to negotiate an initial salary.

Cross-platform rounded corners without images, extra markup or CSS. The holy grail of web-design dweebs.

Hassles with background-image and font sizes

The other day I was working on upgrading the eVision web site to the new look (as well as the latest WordPress 2.02). While I’ve been using HTML for more than a decade, I have to admit, my grasp on CSS is patchy. I’m still picking it up. So it took a bit of wrestling to get it to (more or less) match the design provided by the graphic designer. The big graphic still isn’t in quite the right spot, but no matter, it’s still a vast improvement over the old one.

I did learn a couple of (possibly) valuable tips:

  • In Firefox, the background-image of a div doesn’t display in the portion of the div that has nothing in it. In my case, I had a UL (which forms the dropdown menu) in there, right justified. The background only appeared in the left hand bit in IE, not Firefox. I had to add a &nbsp; to it to get it to appear… and then I had to specify a height, so the background image would go to the right height, instead of just the nominal height of the non-existent text.
  • Font sizes… after complaints from a colleague who is keen on big text, I had to remove all the references to pt sizes in text, in favour of em, so that IE would resize the text when asked. Firefox handles this even if you’ve got all pt sizes.
  • I also learnt I need to study CSS a bit more. The next projects will be doing some more upgrades and new WordPress themes, I think. I’ve got a few that need doing.
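Swapping pt sizes for em is simple arithmetic: an em value is the wanted size divided by the parent element's font size. Here is a rough sketch in Python. The pt_to_em helper is made up for illustration, and the 12pt base is an assumption (it matches the usual 16px browser default), so substitute whatever your own base size is.

```python
def pt_to_em(size_pt: float, base_pt: float = 12.0) -> float:
    """Express a point size relative to the parent's font size.

    base_pt is an assumed parent size; 12pt corresponds to the common
    16px browser default.
    """
    return round(size_pt / base_pt, 3)


# Convert a few typical stylesheet sizes.
for size in (8, 9, 10, 12, 14):
    print(f"{size}pt -> {pt_to_em(size)}em")
```

Bear in mind that em values compound when elements are nested, so the right base is the size of the parent element, not necessarily the page default.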

NVU

I’ve been playing around with the NVU web page editor, an open source application available for Windows, Mac and Linux. So far it’s good stuff, certainly rivalling Frontpage, and heaps better for new users looking for something cheap or free other than MS Word (which has well-known problems).

Now up to version 1.0, it probably isn’t on a par with Dreamweaver, but for basic WYSIWYG web page editing, definitely worth a look.

Office’s garbled HTML

Brian Jones on why Microsoft Office 2000 (and later) produces such godawful HTML:

Our scenario was that people would start saving “docs” as HTML on their intranet sites and browse them with the browser. We viewed the browser as “electronic paper” that we had to “print” to (i.e. perfect fidelity). We had already got a lot of feedback from our Word97 Internet Assistant add-in that any loss of fidelity when saving as a web page was unacceptable and a “bug”. As it turned out, this usage scenario did not become as common as we thought it would and a zillion conspiracy theories formed about why we “really” did it. Many people assumed that a better approach would have been to save as “clean” HTML even if the result did not look exactly like what the user saw on the screen. We felt that the core office applications (other than FrontPage) were not really meant to be web page authoring tools, so we focused on converting docs to exact replicas in HTML. We didn’t want people losing any functionality when saving to HTML so we had to figure out a way to store everything that could have existed in a binary document as HTML. We thought we were clever creating a bunch of “mso-” css properties that allowed us to roundtrip everything. HTML didn’t take off in the same way we had expected, and today, the main use for Office HTML is for interoperability on the clipboard, though of course the biggest use is within e-mail (WordMail).

None of this explains why Office 2003’s “Filtered HTML” is so riddled with proprietary tags, though. Admittedly, a filtered HTML file is smaller than a roundtrip HTML file out of Word, but it’s still hugely bigger than the type of HTML you’d write from scratch (or in a web page editor such as Dreamweaver or Frontpage), and the source code is unreadable.

To my mind, Filtered HTML should be just that: HTML, filtered in such a way that the basic structure of the document is preserved, but none of the junk that Word (or whatever) stores along with it. Leave that for the roundtrip HTML — though I can’t see the appeal in that either, since if you want to store documents in a viewable form on the great InterWeb, PDF is the way to go. Or just store it in the native Office format for internal use, when you know every user will have the application or a viewer.

(By the way, when I was trying out the roundtrip HTML the other day, while reloading, Word presented me with a strange warning that it was going to query from some nonsense “Z” table to put data in the document. Bizarro. The test document did quote some SQL, but this would seem to suggest the roundtrip HTML isn’t all it’s cracked up to be.)

Anyway, Brian’s full article is about the progression of the Office formats from binary in the 90s into the XML to be used in the next version. Well worth a read if you want some background on the history, and where they’re going now.

Cleaning up HTML out of Office

I found a good guide to cleaning out the gunk that’s in Word’s HTML documents. For the smallest, most efficient files it seems to conclude that the Textism Wordcleaner is the way to go (free for files under 20Kb; for bigger files subscription options are available). This issue has been causing me some angst for some time, and one of these days I’m going to bash out a tool for this myself. (Don’t hold your breath.)
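For what it's worth, a first stab at such a tool might look something like the sketch below, using only Python's standard html.parser module. It is not how Wordcleaner or any other product works. The class name, the function name and the lists of tags and attributes to throw away are all assumptions made for illustration, and a real cleaner would need many more rules.

```python
from html import escape
from html.parser import HTMLParser

# Assumed rules: what counts as "gunk" here is a guess, not a standard.
DROP_TAGS = {"font", "span", "o:p"}
DROP_ATTRS = {"style", "class", "width", "height", "valign", "align", "lang"}


class WordCleaner(HTMLParser):
    """Re-emit HTML without presentational tags and attributes."""

    def __init__(self):
        # Keep entities such as &nbsp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in DROP_TAGS:
            return  # drop the tag itself but keep whatever is inside it
        kept = "".join(
            f" {name}" if value is None else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
            if name not in DROP_ATTRS
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_startendtag(self, tag, attrs):
        # Treat <br /> the same as <br>.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def clean_word_html(html: str) -> str:
    """Return html with the assumed Word-specific clutter removed."""
    parser = WordCleaner()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Comments and doctype declarations are dropped too, because the parser's default handlers for them do nothing. Anything that relied on the removed inline styles for its look would then need rules in a stylesheet.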