Category Archives: Data

Summer 2014 starts

Given recent events pointed out by DavidC, I declare Summer 2014 has started. Our traditional, mid-year Summer.

Wednesday 14 May    Min 10    Max 21    Mostly sunny.
Thursday  15 May    Min 13    Max 22    Partly cloudy.
Friday    16 May    Min 14    Max 22    Mostly sunny.
Saturday  17 May    Min 13    Max 22    Mostly sunny.
Sunday    18 May    Min 12    Max 21    Partly cloudy.
Monday    19 May    Min 12    Max 21    Sunny.
Tuesday   20 May    Min 14    Max 21    Partly cloudy.

Good thing you guys voted in that Abbott government.

The upside of climate change is that I get to paint the house this week. Two weeks before the start of Winter.

The poor are failed by the loss of obsolete medical procedures

The following rant comes courtesy of a speaker to a group of volunteer developers working on OpenMRS, who recounted her experiences of volunteering as a doctor in India.

Naturally, when you go under the knife for a surgical procedure, you’d want the surgeon using the latest, most advanced techniques, as demonstrated by empirical evidence.  Health systems want the surgeons to use the most efficient technique, expressed in positive outcomes per money spent.  You’d expect that in today’s world, you’d get one of the two, or perhaps somewhere in between.

Say that the latest technique uses robo-surgeons. Let’s call that technique Z.  It was pioneered in a university teaching hospital at enormous cost, because they’d never built one before; there’s no commercial provider of the equipment yet, so technique Z hasn’t percolated to wider practice.  Most other hospitals use techniques X or Y, one requiring more, highly trained staff, and the other requiring fewer staff but a couple of expensive pieces of equipment. Techniques X and Y are variations on T, U, V and W, some of which date back to the early sixties, and stem off from technique S.  If you look at textbooks, S is mentioned by name, and T, U, V and W have one- or two-sentence descriptions because while major leaps forward at the time, they’re now obsolete in the era of X and Y.  The medical textbooks describe how to do X and Y in detail.

In developing countries, you don’t have either the many staff, the highly trained staff or the expensive pieces of equipment.  U, V and W are all unavailable because of this. T uses equipment that can’t even be procured any more and certainly isn’t lying around waiting to assist with surgery now.

The developing world needs medical and surgical texts that don’t demand powerful diagnostic tools, expensive equipment or highly specialized staff.  A competent surgeon can do their work without any of these; they’ll get worse expected outcomes, but those outcomes will be better than inaction.  There are no textbooks currently available to instruct a surgeon with limited resources.  Even battlefield surgeons expect to stabilize their patient and ship them off to much better hospitals.

The ongoing progress in medicine is leaving behind the poorest and most vulnerable on our planet; our indifference to the preservation of these old methods is affecting us now, in ways I would never have guessed at.

Summer 2013/2014 ends

The seven-day forecast for Melbourne makes today the last day of Summer:

Wednesday                    Max 16    Showers mainly this morning.
Thursday  1 May    Min 7     Max 18    Partly cloudy.
Friday    2 May    Min 11    Max 15    Rain at times.
Saturday  3 May    Min 8     Max 14    Shower or two.
Sunday    4 May    Min 10    Max 15    Shower or two.
Monday    5 May    Min 11    Max 16    Shower or two.
Tuesday   6 May    Min 10    Max 15    Mostly dry.

Of course, Summer persists while any temperature in a week is 20 degrees or above.

Allow more JavaScript, maintain privacy

I’ve long regarded JavaScript in the browser to be one of the biggest security holes in web-browsing, and at the same time the Internet works less and less well without it. In 2008 Joel Spolsky made the observation that for some people the Internet is just broken:

Spolsky:   Does anybody really turn off JavaScript nowadays, and like successfully surf the Internets?

Atwood:   Yeah, I was going through my blog…

Spolsky:   It seems like half of all sites would be broken.

Which is not wrong.  Things have changed in the last five years, and now the Internet is even more broken if you’re not willing to do whatever random things the site you’re looking at tells you to, and whatever other random sites that site links off to tell you to, plus whatever their JavaScript in turn tells you to. This bugs me because it marginalizes the vulnerable (the visually impaired, specifically), and is also a gaping security hole.  And the performance drain!

Normally I rock with JavaScript-disabling tools as part of my tin-foil-hat approach to the Internet, but I’m now seeing that the Internet is increasingly dependent on fat clients. I’ve seen blogging sites that come up empty, because they can’t lay out their content without client-side scripting and refuse to fall back gracefully.

So, I need finer granularity of control.  Part one is RequestPolicy for FireFox, similar to which (but not as fine-grained) is Cross-Domain Request Filter for Chrome.

The extensive tracking performed by Google, Facebook, Twitter et al gives me the willies. These particular organisations can be blocked by ShareMeNot, but the galling thing is that the ShareMeNot download page demands JavaScript to display a screenshot and a clickable graphical button – which could easily have been implemented as an image with an href. What the hell is wrong with kids these days?

Anyway, here’s the base configuration for my browsers these days:

FireFox | Chrome | Reason
HTTPSEverywhere | HTTPSEverywhere | Avoid inadvertent privacy leakage
Self Destructing Cookies | “Third party cookies and site data” is blocked via the browser’s Settings, with manual approval of individual third party cookies | Avoid tracking; StackOverflow (for example) completely breaks without cookies
RequestPolicy | Cross-Domain Request Filter for Chrome | Browser security and performance, avoid tracking
NoScript | NotScripts | Browser security and performance, avoid tracking
AdBlock Edge | Adblock Plus | Ad blocking
DoNotTrackMe | DoNotTrackMe | Avoid tracking – use social media when you want, not all the time
Firegloves (no longer available), could replace with Blender or Blend In | | I’ve had layout issues when using Firegloves and couldn’t turn it off site-by-site

Summer 2013/2014 starts

The current 7-day forecast for Melbourne:

Friday   30 August           Max 20 Shower or two.
Saturday 31 August    Min 12 Max 23 Sunny.
Sunday    1 September Min 15 Max 25 Partly cloudy.
Monday    2 September Min 12 Max 23 Partly cloudy.
Tuesday   3 September Min 11 Max 25 Partly cloudy.
Wednesday 4 September Min 16 Max 26 Shower or two developing.
Thursday  5 September Min 16 Max 20 Shower or two.

I declare summer whenever there’s going to be 7 consecutive days above 19 degrees.  Previously, the earliest Summer has started was mid-September, but typically it’s been moving forward from October or November.
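The rule is mechanical enough to sketch in code. This is only an illustration of the declaration rule, not something the forecasts were ever run through; the names `summer_starts` and `kForecastMax` are made up for the example, and the maxima are copied from the forecast above.

```cpp
#include <cstddef>

// Forecast maxima (degrees C) copied from the 7-day forecast above,
// Friday 30 August to Thursday 5 September.
constexpr int kForecastMax[] = {20, 23, 25, 23, 25, 26, 20};

// Hypothetical helper: true when every one of the seven forecast maxima
// is above 19 degrees.
template <std::size_t N>
constexpr bool summer_starts(const int (&maxima)[N]) {
    static_assert(N == 7, "the rule looks at a 7-day forecast");
    for (std::size_t i = 0; i < N; ++i) {
        if (maxima[i] <= 19) return false;   // one cool day spoils it
    }
    return true;
}

// Checked at compile time: the forecast above qualifies.
static_assert(summer_starts(kForecastMax), "all seven maxima exceed 19");
```

A single day forecast at 19 or below anywhere in the week is enough to make the function return false.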

Remember we’ve got an election coming up in a week’s time, and that’s your opportunity to repeal the carbon tax.  Which we need to do, to keep lovely balmy weather happening in winter-time and to keep the cost-of-living down.  Remember: carbon-dioxide is food for plants, and as such good for the environment, which is made out of plants. That’s just science.

Replace a missing remote control with an Arduino and a laptop

I recently found myself without a remote for my WDTV Live media player, and limited resources to do anything about it – but I did have an Arduino, a breadboard and the local Jaycar had an IR LED.  Controlling IR devices is common practice with an Arduino. I would even be able to hack in functions that didn’t exist on the manufacturer’s remote – like creating a three minute skip by switching to 16x speed for 12 seconds.

The first port of call was to obtain Ken Shirriff’s Arduino IR remote control protocol library – as opposed to communications protocols, of which there are quite a number; did you know the first cut of WiFi included an infrared version? Without the remote, I wasn’t able to record and play back the IR signals sent to the WDTVLive, as you would with a learning remote. I had to find what to transmit from my custom remote. A little googling and I found the WD TV Live infrared remote control codes, which also helpfully reveal that the protocol is NEC.

I knocked up a quick proof of concept, installed it and watched it not work. Given I can’t see in infrared, I didn’t know if my circuit was working. I hooked a red LED up in parallel, and it didn’t light up; I thought I had cathode and anode swapped around, so flipped the red LED – and it didn’t light up. I pulled the IR LED, and then the red LED worked… I was shorting out the red LED. I couldn’t – with the bits I had lying around – confirm the device was transmitting anything. Rather than put the LEDs in series, I got a cheap camera-phone with video function, and it could see IR just fine. And it turns out the IR LED was transmitting something, but the WD TV Live media player wasn’t listening. Why?

The NEC infrared control protocol transmits 32 bits in one of two formats: an old (as in elderly) format that encodes for 256 devices with 256 commands each, and an extended format that encodes for ~64K devices with 256 commands each. The first 16 bits encode the device, and the second 16 bits encode the command. 16 bits for one of 256 commands, you ask? Well, one byte of the second 16 bits is the command, and the other is – for error checking purposes – the one’s complement of that. Further details of the pulse timing and protocol contents are available in various places, but they neglect to mention the extended addressing format.

There are many IR control protocols. To use Ken’s IR library you need to know which protocol is used (which the google search revealed), and you can determine the protocol from the timing data found in the LIRC definition of a protocol, in this case the LIRC infrared control protocol for WDTV Live media player remote. The LIRC protocol definition format is described by WinLIRC, so you can see what the timings are. In this case, the NEC protocol is revealed by the header, one and zero definitions, along with the fact that each code has 16 bits of ‘pre-data’ and 16 bits of data (a 32 bit package).

Everything I could see was showing that the two separately arrived at sets of command codes, both empirically sampled from the real world, were compliant with the spec. One of the things the spec taught me was to transmit the NEC code twice, and to wait 70ms between re-transmissions.
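The 16/8/8 split described above can be checked against the button codes used in the sketch further down. What follows is a minimal sketch with made-up helper names (`nec_device`, `nec_command`, `nec_check`, `nec_command_ok`, `nec_extended_address`); it treats each 32-bit constant exactly as written in the #defines and makes no attempt to reconcile bit order with the NEC documentation.

```cpp
#include <cstdint>

// Hypothetical helpers: split a 32-bit code, as written in the #defines,
// into its device, command and check fields.
constexpr std::uint16_t nec_device(std::uint32_t code)  { return static_cast<std::uint16_t>(code >> 16); }
constexpr std::uint8_t  nec_command(std::uint32_t code) { return static_cast<std::uint8_t>(code >> 8); }
constexpr std::uint8_t  nec_check(std::uint32_t code)   { return static_cast<std::uint8_t>(code); }

// The command is well-formed when the last byte is its one's complement.
constexpr bool nec_command_ok(std::uint32_t code) {
    return (nec_command(code) ^ nec_check(code)) == 0xFF;
}

// Old format: the second device byte is the complement of the first.
// Extended format: the 16 device bits are used as-is, so the two bytes
// are NOT complements of each other.
constexpr bool nec_extended_address(std::uint32_t code) {
    return (((code >> 24) ^ (code >> 16)) & 0xFF) != 0xFF;
}

// btn_enter from the sketch: device 0x219E, command 0x10, check 0xEF.
static_assert(nec_device(0x219E10EF) == 0x219E, "device field");
static_assert(nec_command_ok(0x219E10EF), "0x10 and 0xEF are complements");
static_assert(nec_extended_address(0x219E10EF), "0x21 and 0x9E are not");
```

By this test the WDTV Live codes use the extended addressing format, which is the one the write-ups tend to leave out.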

I wasted time finding other codes for the remote, in other formats; I checked for byte ordering issues. Nothing worked.

The actual problem was that the variable holding the command, now an unsigned long, was originally an int; failing to notice this simple error led me to spend a long time trying to figure out why nothing was happening when I transmitted a command. One of the problems with the C language is that the guarantees about data sizes aren’t worth much.  My entire life has been spent programming on architectures that have 32 bit data words; C compilers on these machines have all defined an int as 32 bits, but I’ve always been aware that the language spec says that an int is at least as wide as a short, which is at least as wide as a char, with actual widths being up to the compiler implementation (although why you’d have different words for things of the same size is beyond me).  The AVR microcontroller in question has an 8 bit word; mathematical operations typically yield an 8 bit result (multiply is an exception), with compilers needing to emit more instructions to yield greater data widths. The defines express the codes as four byte values, which were then wrangled into a two byte int, and then again into an unsigned four byte integer when passed to the IR library. The truncated bits were the reason the player did nothing.
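The damage is easy to reproduce on a desktop machine. In this sketch `int16_t` stands in for the two byte AVR int and `uint32_t` for the four byte unsigned long; the function name `through_avr_int` is invented for the illustration, and the out-of-range conversion to `int16_t` relies on the usual two's complement wrap-around (implementation-defined before C++20, though it behaves this way on the common compilers).

```cpp
#include <cstdint>

// Model of "store the code in a 16-bit int, then pass it to a function
// that takes a 32-bit unsigned long".
constexpr std::uint32_t through_avr_int(std::uint32_t code) {
    return static_cast<std::uint32_t>(static_cast<std::int16_t>(code));
}

// btn_enter: the device bits 0x219E vanish, leaving only 0x10EF.
static_assert(through_avr_int(0x219E10EF) == 0x000010EF, "device bits lost");

// btn_right: 0x906F is negative as a 16-bit int, so widening it
// sign-extends and the device bits become 0xFFFF.
static_assert(through_avr_int(0x219E906F) == 0xFFFF906F, "sign-extended");
```

Either way the device half of the code no longer says 0x219E, which would explain a player that ignores every transmission.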

Even with this fundamental problem solved, confusion was added by the fact that one of the memory cells in my Arduino is faulty. Once IR control code transmission was working, I noticed that sometimes it didn’t work. I decided to echo the command to the serial port, and the command being transmitted didn’t match that for the key pressed – the second byte was wrong. I added code to work around this memory corruption (not shown in the code below, because this is pretty unusual). I’ve never come across this kind of problem before; recognising and then solving something like that is pretty old-school.

/*
Pin 3 is hard-wired into the IR library as the emitter
 */
#include <IRremote.h>
//#define DEBUG

IRsend irsend;

#define btn_enter  0x219E10EF
#define btn_right  0x219E906F
#define btn_left   0x219EE01F
#define btn_down   0x219E00FF
#define btn_up     0x219EA05F
#define btn_option 0x219E58A7
#define btn_back   0x219ED827
#define btn_stop   0x219E20DF
#define btn_rew    0x219EF807
#define btn_ff     0x219E7887
#define btn_play   0x219E50AF
#define btn_prev   0x219E40BF
#define btn_next   0x219E807F
#define btn_eject  0x219E08F7
#define btn_search 0x219EF00F
#define btn_home   0x219E609F
#define btn_power  0x219E48B7

// Pin 13 has an LED connected on most Arduino boards.
// give it a name:
const int onboard_led = 13;
const int retransmit=2;
unsigned long play_after=0;

void setup()
{
  pinMode(3, OUTPUT);     
  pinMode(onboard_led, OUTPUT);     
  Serial.begin(9600);
  Serial.println("WDTV Live serial controlled IR remote");
  Serial.println("~ Power    Eject ^ & Search   Rew - + FF");
  Serial.println("  w         Back q e Enter   Play  P");
  Serial.println("a s d (Arrows)     x Stop    Last < > Next");
  Serial.println("3 - FastForward three minutes");
}

void loop() {
  unsigned long cmd=0;
  if (Serial.available()) {
    switch (Serial.read()) {
      case 'E':
      case 'e':
      case ')':
      case '0':
      case 'O':
      case 'o': cmd=btn_enter; break;
      case 'q':
      case 'Q': cmd=btn_back; break;
      case 'P':
      case 'p':
      case ' ': cmd=btn_play; break;
      case 'S':
      case 's': cmd=btn_down; break;
      case 'W':
      case 'w': cmd=btn_up; break;
      case 'A':
      case 'a': cmd=btn_left; break;
      case 'D':
      case 'd': cmd=btn_right; break;
      case '-':
      case '_': cmd=btn_rew; break;
      case '=':
      case '+': cmd=btn_ff; break;
      case ',':
      case '<': cmd=btn_prev; break;
      case '.':       
      case '>': cmd=btn_next; break;
      case '/':
      case '?': cmd=btn_option; break;
      case '~': cmd=btn_power; break;
      case '!':
      case '1': cmd=btn_home; break;
      case '^':
      case '6': cmd=btn_eject; break;
      case '*':
      case '8': cmd=btn_search; break;
      case 'x':
      case 'X': cmd=btn_stop; break;
      case '3':
        // Start the skip: queue four FF presses, unless a skip is already running
        if (!play_after) play_after=4;
        break;
    }
  }
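  // play_after does double duty: 0 = idle, 1..5 = FF presses still to send,
  // anything larger = the millis() time at which to send Play.
  if (play_after > 0) {
    if (cmd) {
      play_after=0;   // any other key cancels the skip
    }
    else if (play_after > 5) {
      if (play_after < millis()) {
        cmd=btn_play;   // the 12 seconds are up, resume normal playback
        play_after=0;
      }
    }
    else {
      cmd=btn_ff;
      if (--play_after == 0) {
        play_after=millis()+12000;   // last FF sent, start the 12 second timer
      }
    }
  }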
  if (cmd) {
    digitalWrite(onboard_led, HIGH);   // turn the LED on to indicate activity
    for (int i = 0; i < retransmit; i++) {
      irsend.sendNEC(cmd, 32);
      delay(70);
    }
#ifdef DEBUG
    Serial.println(cmd, HEX);
#endif
    digitalWrite(onboard_led, LOW);    // turn the LED off - we're done transmitting
  }
}

In other links, How-To: IR Remote Control your Computer

Flooding with water

So, I’m looking at properties, and a number are down on the floodplain near the local moving body of water, a river/creek.  I wonder to myself if the area is at any risk from floodwater; should I even bother looking at the area?

The council, being the government body most connected to the area, ought to know.  It doesn’t; all it can tell me is whether a specific property has a flood overlay, which says that modelling has determined that it is at risk of a 1 in 100 year flood.

What is the 1 in 100 year flood event?

The 1 in 100 year flood event is the storm that happens on
average once every one hundred years (or a 1% chance of
occurring in any given year).

Now, that means in any given year there’s a 99% chance you’re not going to get flooded.  In 100 years, that means a 0.99^100 or a 36.6% chance of not getting flooded. A 2/3 chance of having water washing through your home at some point there.  Basically, that’s a guarantee that in the next century your home will be damper than normal – because the 1 in 100 year events are calculated off historic data, not forward climate models.  And the forward models say that things are only going to get more extreme; have you noticed how 1 in 100 year events seem to happen to the same place every decade or so?
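For anyone who wants to try other time spans, the arithmetic above is small enough to sketch. The function name `chance_of_flood` is made up, and the calculation assumes every year is an independent draw with the same 1% chance, which is exactly the assumption the historic-data caveat calls into question.

```cpp
// Hypothetical helper: chance of at least one flood over `years` years,
// assuming each year is independent with the same annual chance.
constexpr double chance_of_flood(double annual_chance, unsigned years) {
    double dry = 1.0;                        // chance of staying dry so far
    for (unsigned y = 0; y < years; ++y) {
        dry *= 1.0 - annual_chance;          // survive one more year
    }
    return 1.0 - dry;
}

// A century on the 1 in 100 line: 1 - 0.99^100, a little over 63%.
constexpr double kCentury = chance_of_flood(0.01, 100);
static_assert(kCentury > 0.63 && kCentury < 0.64, "roughly two chances in three");
```

Swapping in a shorter span, such as the length of a mortgage, gives the odds for the time you would actually own the place.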

In fact, pretty much anyone you talk to – water utilities for example – will only talk about 1 in 100 events. Vital government infrastructure (stuff that has to keep operating in the event of a flood disaster, like hospitals and my home) has to be above the 1 in 500 line. From what I’m told, they calculate this on a site-by-site basis rather than having a map (they’re not building a bunch of new hospitals, so it’s easier that way).  Sites aren’t rated as being 1 in 110 year, you’re either in the 100 year box or not rated at all.

The gist of what I was able to read into the subtext of the hints being passed in my conversation with a town planner specializing in flooding was: Floodplains get flooded, even in cities, even if there’s a wetland further upriver that could absorb a sudden influx of water, even if the sides of the creek are quite steep and the channel is surprisingly broad, and even if there are barricades; if you don’t like that, don’t live there.

So I won’t.  It makes searching for a home so much easier, even if the homes out of the floodplain are more expensive and built on those annoyingly sloped hill things.

Actually, this reminds me of the 1972 Elizabeth St Floods my Mum told me about getting caught in. I would never have guessed a major street in our CBD could turn into a river – and then it happened again in 2010.

Summer 2011/2012 starts

I declare summer whenever there’s going to be 7 consecutive days above 19 degrees.  And as such:

Friday            Max 22  Shower or two.    
17/09/11  Min 10  Max 25  Shower or two developing.    
18/09/11  Min 12  Max 21  Sunny.    
19/09/11  Min 11  Max 27  Showers developing. Windy.
20/09/11  Min 12  Max 20  Shower or two.     
21/09/11  Min 13  Max 22  Morning shower or two.    
22/09/11  Min 12  Max 24  Mostly sunny.

Summer.

In September.

It’s a good thing that global warming is a beat up by the greens, a front for communist interests trying to take control of our lives and introduce excessive and unneeded regulation – or else I might be worried.

Census night is coming

The census delivery chick turned up and offered us the option of paper or electronic form.

Two programmers looked at each other, thought about how they value their time and the response was a no-brainer:

“We’re programmers,” I explained, “we’ll take the paper form.”

“There’s a phone number you can call if you have any trouble filling out the electronic form” reassures the collector.

Cathy thinks: “Sure, that line won’t have any trouble when twenty million Australians simultaneously log into the web site to fill in the forms via a broken SSL link, using IE specific controls (that only work under some versions of Windows assuming they’re correctly patched and have the right libraries loaded), demanding full round-trips to the underspec’d Windows servers to populate unnecessarily complex custom controls, some of which will no doubt demand Flash or COM. Come to think of it, it probably won’t even be web based, and we’ve only got two Windows boxes, one of which is tucked under a table (Yay! Census night on the floor swearing at the ABS’s programmers!) and the other has a screen resolution that went out with buggy whips (I’ve had programs barf and refuse to run because the resolution was unacceptable).”

We chose paper. For another view of the world, I’m looking forward to hearing how census night worked for Daniel…

Importing into SQL Server

Alas SQL Server Management Studio isn’t as friendly as it could be for pasting in data. You’d think Microsoft would have this humming, but when I tried to paste from Excel, it attempted to paste the entire first row from my spreadsheet into the first column (in one row) of the database.

Using MS Access to open up the database probably would have worked, but I didn’t have it on that machine.

Trying to import using the SQL Server Import And Export Data wizard from a CSV text file worked for a small amount of data, but the 80,000 rows I was trying to import from the world ports code list didn’t. Time and time again it would report an error (unspecified) and give me the option of Abort, Retry, Ignore. No matter which option I chose, it crashed.

While the 64-bit version of the wizard on my 64-bit Win7 machine didn’t allow you to import from Excel/Access, the 32-bit version did (presumably because MS Office, at least the version I have installed, is 32-bit).

The next problem was that it only supported Excel 2003 format, which can’t handle more than 64K rows. I ended up having to split the data into two and import the two spreadsheets separately. Then it worked.

Shame the wizard is so flaky, and of course it’s a big shame that Management Studio doesn’t do copy/paste like one would expect. (Maybe that too was a 32-bit/64-bit issue.)

AU timezones

I’m not happy when I see someone technical quoting a time in summer (eg during daylight savings) which claims to be “AEST”.

It’s almost certainly actually AEDT.

A summary of the abbreviations, which looks reasonably official, is here: Australian timezones.

In summary, AEST, ACST and AWST apply in winter. AEDT, ACDT and AWST apply in summer, in the places that observe daylight saving.

The other issue I had with a recent email was it said 12:00pm AEST. I think in this context, they meant midnight AEDT, but it’s confusing. Better to either say Midnight, or use 24-hour time: 00:00.

Facebook: Download your information

I had a quick look at Facebook’s Download Your Information feature, evidently added a few months ago due to criticism about the accessibility of people’s data once it’s dumped into the Facebook bottomless pit.

You can find it via the My Account screen, by clicking Download Your Information.

It asks for some time to compile all the information — in my case this took about half an hour — then emails you to say it’s ready to download, and provides a link and re-checks your password.

It comes as a single zip file, with HTML and pictures inside it.

Opening the index.html file, you’ll find a version of your Profile page, with links to all the other information in the archive, including Wall, Photos, Friends, Events, Messages.

The Wall in my case was 1.5 MB of HTML, going back to 2007, and I suspect it is every Wall post (and replies from friends) I’ve ever made. Friends is just an unlinked list of all your friends (name only). Messages has all your message threads, and replies.

You can browse the photos via the directory of the same name; subdirectories reflect the folders. It looks like all the photo files are at the size that Facebook shrunk them down to when they were uploaded.

To actually get this information into another service, you’d need to do some trickery with munging the HTML. The code they’ve used seems relatively clean and easy to parse.

So all in all, quite a handy feature, and it goes a long way towards dispelling fears that information pumped into Facebook was lost forever behind a zillion clicks to show “Older Posts”.

(It doesn’t appear that Twitter has a comparable feature.)