MeetUp Data Science Northeast Netherlands: January 14, 2016
MATCHING PRODUCT DATA USING ELASTICSEARCH - DOLF TRIESCHNIGG, MYDATAFACTORY
Product data is everywhere, ranging from size and colour information about products on e-commerce websites to specifications of spare parts in enterprise databases. Finding the desired product in such a database is difficult because of the mismatch between product descriptions. Product metadata might be described or spelled differently, the same description might have multiple meanings, or vital information might be missing.
In this talk, Dolf Trieschnigg will discuss the challenges of product search and how Mydatafactory deals with them. He will explain how the company uses and adapts Elasticsearch, an open source search system, to address some of these problems in the context of industrial product data.
MANAGING UNCERTAINTY IN DATA: THE KEY TO EFFECTIVE MANAGEMENT OF DATA QUALITY PROBLEMS - MAURICE VAN KEULEN, UNIVERSITY OF TWENTE
Business analytics and data science are significantly impaired by a wide variety of 'data handling' issues, especially when data from different sources is combined and when unstructured data is involved. The root cause of many such problems lies in data semantics and data quality.
Maurice van Keulen and his colleagues have developed a generic method based on modeling such problems as uncertainty *in* the data. A recently conceived kind of DBMS, the UDBMS or "Uncertain Database", can store, manage, and query large volumes of uncertain data. Together, the method and the UDBMS make it possible to, for example, postpone the resolution of data problems and assess their influence on analytical results. They are also developing technology for data cleansing, web harvesting, and natural language processing that uses this method to deal with the ambiguity of natural language and many other problems encountered when using unstructured data.
On 14 January, the first Data Science Northeast Netherlands meetup will take place!
Mydatafactory wants to bring together both novice and experienced developers and researchers in the area of data science in an informal setting.
18:00 - 19:00 Pizzas, drinks, networking
19:00 - 19:30 Matching product data using Elasticsearch
20:00 - 20:30 Managing uncertainty in data
20:30 - 21:30 More drinks & networking