Wednesday, November 10, 2010

Google Refine 2.0

(Cross-posted on the Google Open Source blog.)

Our acquisition of Metaweb back in July also brought along Freebase Gridworks, an open source software project for cleaning and enhancing entire data sets. Today we’re announcing that the project has been renamed Google Refine and that version 2.0 is now available.

Google Refine is a power tool for working with messy data sets: cleaning up inconsistencies, transforming the data from one format into another, and extending it with new data from external web services or other databases. Version 2.0 introduces a new extensions architecture, a reconciliation framework for linking records to other databases (like Freebase), and a ton of new transformation commands and expressions.

Freebase Gridworks 1.0 has already been well received by the data journalism and open government data communities (you can read how the Chicago Tribune, ProPublica and data.gov.uk have used it) and we are very excited by what they and others will be able to do with this new release. To learn more about what you can do with Google Refine 2.0, watch the following screencasts:

14 comments:

  1. This looks awesome! Is it possible to export it to an Oracle table?

  2. Looks great. Is it possible to do the following?

    1. Create transformations and save them in a "project".
    2. Apply those transformations to newly arrived data and export to a file, all without opening Google Refine (e.g., using a command line like "googlerefine.exe myproject.prj myoutput.csv").

    Replies
    1. Did you ever get an answer to this question? I'm interested in doing the same thing.

  3. I've had difficulty downloading it to my PC. Where do you post system requirements?

  4. Thanks a lot for this awesome tool. 1 day of work => 1 hour. Magic.

  5. Video #3 (on Data Augmentation) is shown as private whenever I try to play. Any chance you can fix that? The first two were helpful, and I'm looking forward to learning more.

    I was also wondering the same thing Tim asked about. Is there any way that Google Refine's functions can be automated?

  6. Why isn't Movie #3 available anymore?

  7. Any explanation why Video #3 has been "privatized"?

  8. Still waiting for an answer to the last three (identical) queries.

  9. Oh, never mind. Someone left a copy: http://code.google.com/p/google-refine/downloads/detail?name=refine-2.0-data-augmentation.mp4&can=2&q=

  10. Your post is nice. I have liked the way you have written this. I have listed your blog in my blog directory. To check this please visit Submit Blog Get Traffic (http://submitbloggettraffic.blogspot.com).

  12. Not able to see the 3rd (Data Augmentation) video. There is an error saying "This video is private".

  13. Is Google Refine still a live project?
