10 Nov

Plowing through Unstructured Data

Jon Udell just ran into some of the same practical obstacles to sharing and representing structured data that I’ve been hitting myself. To produce an entry about Circuit City’s store closures, I had to spend a lot of time massaging the source data (which came from a PDF) in Excel before it could be mapped and charted. Tasks that add little value and should take 5 minutes easily balloon into hours of menial work renormalizing and restructuring data that should have been published as CSV or XML in the first place. “Fake” digital content, meaning documents that are digital in form but not machine-readable, is going to get in the way of publishers for the foreseeable future. The challenge is to optimize the workflow so that enhanced news coverage can be produced at a reasonable cost and in a reasonable time. It’s all about making things replicable.
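One way to make this kind of cleanup replicable is to script it rather than redo it by hand in a spreadsheet. The following is only a minimal sketch in Python, not the process I actually used: it assumes the PDF text has already been copied out as plain lines whose columns are separated by two or more spaces, and the sample rows, the column names, and the helper names `pdf_text_to_rows` and `rows_to_csv` are all invented for illustration.

```python
import csv
import io
import re

# Hypothetical text copied out of a PDF table. The rows are invented
# placeholders, not real store data.
RAW_TEXT = """\
Store List            Page 1
City          State   Zip
Springfield   XX      00001
Shelbyville   XX      00002

Store List            Page 2
City          State   Zip
Capital City  XX      00003
"""


def pdf_text_to_rows(raw_text, header):
    """Split space-aligned lines into fields.

    Blank lines, repeated header rows, and any line whose field count
    differs from the header (such as page titles) are skipped.
    """
    rows = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Assumption: columns are separated by runs of 2+ spaces, so a
        # single space inside a value ("Capital City") is preserved.
        fields = re.split(r"\s{2,}", line)
        if fields == header or len(fields) != len(header):
            continue
        rows.append(fields)
    return rows


def rows_to_csv(header, rows):
    """Serialize the header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    header = ["City", "State", "Zip"]
    print(rows_to_csv(header, pdf_text_to_rows(RAW_TEXT, header)))
```

Real PDF exports are usually messier than this (wrapped cells, merged columns, inconsistent spacing), so the splitting rule would need adjusting for each source. The point is that once the rule is written down, the next update of the same document takes minutes instead of hours.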
