Plowing through Unstructured Data

In: web apps

10 Nov 2008

Jon Udell just ran into some of the same practical obstacles to sharing and representing structured data that I’ve been hitting myself. To produce an entry about Circuit City’s store closures, I had to spend a lot of time massaging the source data (which came from a PDF) in Excel before it could be mapped and charted. Tasks that add little value and should take 5 minutes easily balloon into hours of menial work to renormalize and restructure data that should have been published as CSV or XML in the first place. “Fake” digital content, meaning data that is technically digital but not machine-readable, is going to get in the way of publishers for the foreseeable future. The challenge is to optimize the workflow so that enhanced news coverage can be produced at a reasonable cost and in a reasonable time. It’s all about making things replicable.
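As a rough illustration of what a replicable cleanup step could look like, here is a minimal Python sketch. It assumes the table text has already been copied out of the PDF and that columns are separated by runs of two or more spaces. The sample rows and the function names are hypothetical and made up for illustration; they are not the actual store-closure data or the process I used in Excel.

```python
import csv
import io
import re

# Hypothetical sample of text copied out of a PDF table. These rows are
# made up for illustration; columns are separated by 2+ spaces.
RAW_TEXT = """\
Store   City            State
0101    Springfield     IL
0202    Portland        OR

0303    Salt Lake City  UT
"""


def rows_from_pdf_text(text):
    """Split each non-empty line into fields on runs of 2+ spaces."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. left over from page breaks
        rows.append(re.split(r"\s{2,}", line))
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(rows_from_pdf_text(RAW_TEXT)), end="")
```

Once a script like this exists, the next PDF in the same layout costs minutes rather than hours, which is the point of making the workflow replicable. Real PDF extracts are usually messier (wrapped cells, repeated headers), so the splitting rule would need adjusting per source.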


About this blog

I'm CEO of an online/mobile trade publishing firm in the marketing and defense verticals. We strive to make news and data digestible and useful in an environment that is noisier by the day.

This personal blog mixes my thoughts and interests on politics, business, publishing, software, and more. Over the years I have posted items that turned out spectacularly wrong, and a few that have better stood the test of time.
