No more messy data: What to do when your data doesn’t match up.

Michael Irvine is the creator of Joiner, a tool that uses machine learning to match up your messy data, wherever it lives.

What if all your data sources spoke to each other? What if analyzing all your data in one place were easy instead of hard? That is the promise of data warehouses like Amazon Redshift and Snowflake, and of ETL (extract-transform-load) solutions like Stitch. You can bring all your other databases and flat files together in one place. That way your CRM and your product database (for example) can talk to each other, and you can use them both for analytics in a tool like Looker or Mode.

But there is a problem, a stumbling block that comes up again and again for data analysts: even if you can access multiple databases, you can’t join across them if the data doesn’t match up.

Imagine you have a bunch of companies you work with in Salesforce. Salesforce is organized around legal entity names, for example “Acme Dynamite Products LLC.” But in your actual product, the same company is just known as “ACME”. A naive SQL query like INNER JOIN salesforce.customers ON salesforce.customers.name = production.customers.name won’t work, because the two names are never exactly equal and the join returns no rows for that customer.
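To see the failure concretely, here is a minimal sketch in Python using an in-memory SQLite database. The table names salesforce_customers and production_customers are stand-ins I made up for the two schemas in the query above, and the single row in each table is the Acme example from the text.

```python
import sqlite3

# In-memory database standing in for a warehouse that holds both sources.
# Table names are hypothetical stand-ins for salesforce.customers and
# production.customers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salesforce_customers (name TEXT);
    CREATE TABLE production_customers (name TEXT);
    INSERT INTO salesforce_customers VALUES ('Acme Dynamite Products LLC');
    INSERT INTO production_customers VALUES ('ACME');
""")

# The naive join: it only matches rows whose names are exactly equal.
naive_rows = conn.execute("""
    SELECT s.name, p.name
    FROM production_customers AS p
    INNER JOIN salesforce_customers AS s ON s.name = p.name
""").fetchall()

print(naive_rows)  # empty list: the same customer never lines up
```

The query runs without error, which is part of what makes this problem easy to miss: the customer silently drops out of the result.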

The only solution is to spend hours creating a mapping by hand, and then update it every time you add a new customer. Right? Well, in the past it was. But it was never a good solution, for obvious reasons.

That is why I’m excited to announce Joiner and its new automatic Mapping feature. Using the power of PostgreSQL and machine learning, Joiner can automatically create mappings between your databases and tables, no manual effort required.

Now, you don’t have to search through lists of names, wondering if Acme LLC is the same as ACME. The computer will do it for you in seconds! Even better than that, updating your Mappings is as simple as clicking a button.
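To give a feel for what matching names automatically involves, here is a deliberately simple sketch in Python. It is not how Joiner works: Joiner uses PostgreSQL and machine learning, while this swaps in a plain token-overlap heuristic. The function names, the list of legal suffixes, the 0.8 threshold, and every sample name other than Acme are assumptions chosen for illustration.

```python
import re

# Hypothetical list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"llc", "inc", "ltd", "corp", "corporation", "co"}


def normalize(name):
    """Lowercase a name, split it into word tokens, and drop legal suffixes."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return {t for t in tokens if t not in LEGAL_SUFFIXES}


def overlap_score(a, b):
    """Share of the shorter name's tokens that also appear in the other name."""
    ta, tb = normalize(a), normalize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / min(len(ta), len(tb))


def build_mapping(left_names, right_names, threshold=0.8):
    """Map each left name to its best-scoring right name, if it clears the threshold."""
    mapping = {}
    for left in left_names:
        best = max(right_names, key=lambda r: overlap_score(left, r), default=None)
        if best is not None and overlap_score(left, best) >= threshold:
            mapping[left] = best
    return mapping


crm_names = ["Acme Dynamite Products LLC", "Globex Corporation", "Initech Inc."]
product_names = ["ACME", "globex", "Umbrella"]

mapping = build_mapping(crm_names, product_names)
print(mapping)
```

With this sample data the two obvious pairs are matched and "Initech Inc." is left unmapped. A heuristic this crude will mis-match companies that share a common word, which is exactly the kind of case where a learned model and a human review step earn their keep.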

Joiner’s mapping setup

You can use your Mappings in Joiner or download them through Joiner’s CSV export function to do VLOOKUPs in Excel or Google Sheets. In all fairness, Joiner Mappings are based on machine learning, so they’re not perfect and may still require some cleaning. Even so, reviewing a generated Mapping should be a lot less work than building one by hand. Get started today!
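If you would rather script the lookup than do it in a spreadsheet, an exported mapping can be used the same way a VLOOKUP would use it. The sketch below is Python and rests on assumptions: the column names salesforce_name and production_name are invented (check the header of your actual export), and the CSV text and the sample rows are made-up illustration data.

```python
import csv
import io

# Stand-in for a downloaded mapping file; the column names are assumptions.
exported_csv = io.StringIO(
    "salesforce_name,production_name\n"
    "Acme Dynamite Products LLC,ACME\n"
)

# Build the lookup table: CRM name -> product name.
lookup = {
    row["salesforce_name"]: row["production_name"]
    for row in csv.DictReader(exported_csv)
}

# Made-up sample rows from each system.
salesforce_rows = [{"name": "Acme Dynamite Products LLC", "owner": "Dana"}]
production_rows = {"ACME": {"active_users": 42}}

# Join through the mapping instead of comparing names directly.
joined = []
for row in salesforce_rows:
    product_name = lookup.get(row["name"])
    if product_name in production_rows:
        joined.append({**row, **production_rows[product_name]})

print(joined)
```

Rows without an entry in the mapping are skipped here, so it is worth counting them after each run to see how much cleaning the mapping still needs.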