MeetUp Data Science Northeast Netherlands: December 15, 2016
Mydatafactory wants to bring together beginning and experienced developers and researchers in the area of data science in an informal setting.
A great opportunity to share ideas and experiences, and to meet the Mydatafactory team and (core) technology!
18:00 - 18:30 Pizzas, drinks, networking
18:30 - 20:00 (schedule to be finalized):
- Searching CV's using Elasticsearch - Henning Rode
- Applying (Big Data) techniques and tools at Kadaster - Michael Karsters
- What we can learn about social events from Twitter - Florian Kunneman
20:00 - 21:00 More drinks & networking
Searching CV's using Elasticsearch - Henning Rode
To scale up its search solutions so that they can smoothly search millions of CVs and job postings, Textkernel decided to switch to Elasticsearch as its backend search engine. In the talk I'll share some experience from the transition to Elasticsearch: the motivation behind the engine switch, transition highlights (what went well, what went wrong), and the work impact. After that, I will go into detail about our self-developed scoring plugin for Elasticsearch, which improves (at least in our case) on the standard retrieval methods provided by Lucene.
Henning Rode is lead search developer at Textkernel. After finishing his PhD and a postdoc position on semi-structured document retrieval, he moved to industry to develop efficient and innovative search solutions with the Textkernel team. He is currently diving into learning-to-rank approaches to further improve the relevance of search results. Besides programming, he loves canoeing, hiking, and singing in choirs.
Applying (Big Data) techniques and tools at Kadaster - Michael Karsters
Abstract & Bio
Michael Karsters has worked at Kadaster since 1998, and since 2008 in the department for special products, solutions and consultancy (GMA). GMA finds solutions for questions raised by our customers. We research evolving technical tools for process efficiency, and we are very interested in the techniques and tools used in the Big Data environment. Kadaster uses Splunk for log stashing; GMA uses Elasticsearch. Currently we are interested in Machine Learning, although it has a steep learning curve. All in all we take small steps, and every step is presented to a broad audience. They get involved and excited, which enables us to proceed and spend more time on research. I will go into more detail on the above during my presentation.
What we can learn about social events from Twitter - Florian Kunneman
After automatically extracting events from the TwiNL dataset, the open Dutch archive of tweet IDs, I set out to shed light on one important issue regarding social events: is it good to have positive expectations? In this talk I will present the Lama Events demo, as well as the first event anticipointment index (anticipointment: high expectations followed by a letdown), built by training classifiers to recognize the emotions of positive expectation, satisfaction and disappointment in tweets. This research might permanently change the way in which you anticipate events.
Florian Kunneman is a researcher based at the Radboud University, Centre for Language Studies, and has a background in communication studies and language technology. He recently finished his dissertation on modelling patterns of time and emotion in Twitter, and is currently working on several projects as a postdoc.