Thursday 27 September 2012

[dcphp-dev] Master Retailer Database -- MediaWiki? OSM?

I have data from a bunch of different government agencies regulating
retail outlets. Most (but not all) of them have some sort of internal
identifier, but it's maddening to try to get reports with data where
the name or address is slightly different, and of course there's no
master id.

So I'm trying to put together a master database, with our own
identifier. I figure we'll go through some address standardization
for the first pass. It gets more complicated, though, when a store
changes its name (e.g. when a business is sold, the address stays the
same but the name doesn't), moves, or is newly added. We're talking
about opening the data up under the Open Database License, and
I'd like to use some sort of standard tool.
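
A rough sketch of what that first pass might look like, assuming a
simple in-memory lookup. The abbreviation table, the MasterIndex class,
and the sample addresses are all made up for illustration; a real pass
would need a much fuller table or a dedicated address library.

```python
import re

# Hypothetical abbreviation table, deliberately tiny.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "boulevard": "blvd",
    "northwest": "nw",
    "northeast": "ne",
    "southwest": "sw",
    "southeast": "se",
    "suite": "ste",
}


def normalize_address(raw):
    """Lowercase, drop punctuation, and abbreviate common words."""
    text = raw.lower().replace(".", "")      # "N.W." becomes "nw"
    text = re.sub(r"[^\w\s]", " ", text)     # other punctuation becomes a space
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


class MasterIndex:
    """Hands out our own master id for each distinct normalized address."""

    def __init__(self):
        self._by_address = {}
        self._next_id = 1

    def master_id(self, raw_address):
        key = normalize_address(raw_address)
        if key not in self._by_address:
            self._by_address[key] = self._next_id
            self._next_id += 1
        return self._by_address[key]


index = MasterIndex()
first = index.master_id("1600 Pennsylvania Avenue, N.W.")
second = index.master_id("1600 Pennsylvania Ave NW")
# Both spellings normalize to the same key, so they share one master id.
```

Keying on the address alone means a name change at the same address
keeps its id, but a store that moves gets a new one, so the move case
would still need separate handling.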

MediaWiki came to mind first. It'd be a great UI, and would allow
people to update the site with specific information, such as whether
a store is no longer in business. But the majority of the data updates
would come from merging lists from various agencies. On the huge plus
side, we'd get an API for access and updating, and a built-in history
tool.

OpenStreetMap was my second idea. Many of the same benefits, but a
bit more awkward to work with. The huge advantage is that the data
relevant to OSM we get from the agencies could be pushed back to the
OSM database. But I'm not sure how to handle our project-specific
data (e.g. violations) which is clearly of no interest to OSM.
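
One way to picture that split, as a sketch only: the keys on the right
are standard OSM tags, while the record field names on the left, the
split_record helper, and the sample record are made up for
illustration.

```python
# Hypothetical record field names mapped to standard OSM tag keys.
OSM_TAG_MAP = {
    "name": "name",
    "house_number": "addr:housenumber",
    "street": "addr:street",
    "city": "addr:city",
    "postcode": "addr:postcode",
    "phone": "phone",
}


def split_record(record):
    """Separate fields OSM could use from project-specific ones."""
    osm_tags = {}
    project_only = {}
    for field, value in record.items():
        if field in OSM_TAG_MAP:
            osm_tags[OSM_TAG_MAP[field]] = value
        else:
            project_only[field] = value
    return osm_tags, project_only


record = {
    "name": "Corner Market",
    "house_number": "123",
    "street": "Main Street NW",
    "city": "Washington",
    "violations": 2,
}
osm_tags, project_only = split_record(record)
# osm_tags holds the name and addr:* values; project_only holds violations.
```

The project-only half would still have to live somewhere of our own,
linked to the OSM object by whatever identifier we settle on.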

I'm leaning toward MediaWiki, and was wondering if anyone has
experience using it to manage data that is more often stored in a
relational database. I'm trying to avoid writing a full site for this.
It's tempting to just start with our own database of name, address,
phone, etc., but then I'd have to build all the history and wiki
features myself. So I'm looking for a combination of free-flowing
wiki-style data and a structured database.

I worked on something a while ago where we just enforced headers,
each of which roughly mapped to a field. But that felt a bit hackish.
What I want is more like a form within a MediaWiki page.
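
One possible shape for this, sketched under assumptions: the common
fields live in a single template call at the top of the page, and
everything else on the page stays free-form. The template name
"Retailer", the field list, and both helpers are hypothetical, and the
parser is deliberately naive (it breaks on values that contain "|" or
nested templates).

```python
import re

TEMPLATE = "Retailer"                    # hypothetical template name
FIELDS = ("name", "address", "phone")    # hypothetical common fields


def render(record):
    """Turn a record into a wikitext template call, one field per line."""
    lines = ["{{" + TEMPLATE]
    for field in FIELDS:
        lines.append("| %s = %s" % (field, record.get(field, "")))
    lines.append("}}")
    return "\n".join(lines)


def parse(wikitext):
    """Pull the fields back out of the first template call on a page."""
    pattern = r"\{\{" + re.escape(TEMPLATE) + r"\s*(.*?)\}\}"
    match = re.search(pattern, wikitext, re.DOTALL)
    if match is None:
        return {}
    record = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            record[key.strip()] = value.strip()
    return record


store = {
    "name": "Corner Market",
    "address": "123 Main St NW",
    "phone": "202-555-0100",
}
page = render(store) + "\n\nFree-form notes about the store go here."
round_trip = parse(page)
# round_trip matches store; the notes below the template are left alone.
```

A merge script could then read a page, parse the template, update the
fields from an agency list, and write the page back, with the wiki's
own history recording each change.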

We will write some plugins for linking to data that comes directly
from a database, like the violations themselves, but the common data
is what I'm thinking about now.

Any pointers?

Thanks!

Tac

--
You received this message because you are subscribed to the Google
Group: "Washington, DC PHP Developers Group" - http://www.dcphp.net
To post, send email to washington-dcphp-group@googlegroups.com
To unsubscribe, send email to washington-dcphp-group+unsubscribe@googlegroups.com
For more options, visit this group at http://groups.google.com/group/washington-dcphp-group?hl=en
